{"id":324,"date":"2022-03-20T13:16:51","date_gmt":"2022-03-20T17:16:51","guid":{"rendered":"http:\/\/khashanlab.org\/?page_id=324"},"modified":"2022-03-20T13:36:49","modified_gmt":"2022-03-20T17:36:49","slug":"cheminformatics","status":"publish","type":"page","link":"http:\/\/khashanlab.org\/cheminformatics\/","title":{"rendered":"Cheminformatics"},"content":{"rendered":"\n
Refinement of Molecular Descriptors for Diversity Analysis of Chemical Libraries<\/strong><\/p>\n\n\n\n The search for lead compounds using computational tools (also known as virtual screening) requires chemical libraries. However, many chemical libraries exist, and searching all of them is time-consuming. A tool is therefore needed to assess the similarity of these libraries and to assist in selecting a diverse set of compounds that is representative of all available libraries. Such a tool relies on molecular descriptors (such as BCUT) to perform the task. Herein, we refined these descriptors and verified that their performance improved.<\/p>\n\n\n\n <\/p>\n\n\n\n Development of Fragment-based Molecular Descriptors using Subgraph Mining<\/strong><\/p>\n\n\n\n Molecular descriptors are also useful for quantitative structure-activity relationship (QSAR) studies. In these studies, a regression (or classification) model is generated to relate a set of molecular descriptors to the activity of the molecules in the study. The model can then be used to predict the activity of other molecules in a virtual screening process. Molecular descriptors are of many kinds; fragment-based descriptors use counts of chemical fragments to describe molecules. Thus, they afford a mechanistic interpretation of the results in terms of the essential pharmacophoric (or toxicophoric) elements responsible for the activity (or toxicity) of molecules. Herein, we developed fragment-based molecular descriptors using frequent subgraph mining (FSM): we used a labeled chemical graph representation of molecules and employed FSM to identify chemical fragments (subgraphs) that occur in at least a fraction (\u03c3) of all molecules in a dataset; see the following figure. We demonstrated (using variable-selection QSAR modeling) that the identified molecular descriptors afford higher discriminatory ability in classifying compounds than other commonly used molecular descriptors.<\/p>\n\n\n\n