Machine Learning Method For Drug Screening Based On Structural Information And Clinical Effect

Posted on:2023-04-06

Degree:Master

Type:Thesis

Country:China

Candidate:C Zhong

GTID:2531306794992789

Subject:Chemical Engineering and Technology

Abstract/Summary:

Because many existing drugs cause serious side effects and pathogens develop resistance to them, researchers must continually screen for and discover new drugs with good efficacy. Domestic innovation in new drug research and development is currently at a critical stage, and methods for discovering drugs with a specific therapeutic effect are urgently needed. Data-driven methods can establish a correspondence between molecular structure and a specific clinical therapeutic effect, so that compounds likely to have the target effect can be identified in a massive compound database.

In this thesis, information on 1132 drugs covering seven classes of therapeutic effect for which innovation is needed is collected from databases such as KEGG, DrugBank, and PubChem. The five types of drug information most relevant to molecular structure and therapeutic effect form the original drug information set. After this information is checked against the literature, four molecular sets containing different drug structure information are obtained, and the detailed drug information in the four sets is used to determine which set is better suited to predicting unknown drugs. Classifying drugs well requires a numerical description of molecular structure. A ChemoPy-RDKit (C-R) molecular description is proposed and compared with four existing molecular descriptions. For classification, the performance of five common supervised algorithms is compared and, based on the comparison results, the algorithms are fused using Dempster-Shafer evidence theory. An external validation molecular set is also used to verify the performance of the classification method.

The best classification result is achieved on the molecular set containing the 844 molecular structures most relevant to drug efficacy. Among the single classifiers, the proposed C-R description gives the highest classification accuracy, and the support vector machine (SVM) achieves the highest recognition rate of the five single classification methods. Fusing SVM with random forest improves classification performance further over SVM alone. These results show that the correspondence between drug structure and therapeutic effect can be extracted by data-driven methods, that compounds with the target effect can be discovered from a massive compound database, and that reliable predictions can be provided for unknown drugs, making early drug development faster and more economical.
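The abstract reports fusing SVM and random forest using Dempster-Shafer evidence theory but does not say how the classifier outputs are turned into evidence. The following is a minimal sketch of Dempster's rule of combination under assumptions that are not from the thesis: each classifier is assumed to output class probabilities, which are discounted by a hypothetical reliability weight, with the remainder assigned to the whole set of classes as ignorance. The class labels, probabilities, and reliability values are made up for illustration.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    A mass function maps a frozenset of class labels (a focal element)
    to its mass; the masses of one function sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, wa), (b, wb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + wa * wb
        else:
            # Product mass falling on contradictory (disjoint) hypotheses.
            conflict += wa * wb
    norm = 1.0 - conflict
    if norm <= 1e-12:
        raise ValueError("sources are in total conflict; rule undefined")
    return {k: v / norm for k, v in combined.items()}


def masses_from_probs(probs, reliability):
    """Assumed mapping from class probabilities to a mass function.

    Keeps `reliability` of the mass on the single classes and moves the
    rest to the whole frame, which represents ignorance.
    """
    frame = frozenset(probs)
    masses = {frozenset([c]): reliability * p for c, p in probs.items() if p > 0}
    masses[frame] = masses.get(frame, 0.0) + (1.0 - reliability)
    return masses


# Hypothetical outputs of two classifiers for one molecule, three classes.
svm_probs = {"A": 0.6, "B": 0.3, "C": 0.1}
rf_probs = {"A": 0.5, "B": 0.4, "C": 0.1}

fused = dempster_combine(
    masses_from_probs(svm_probs, reliability=0.9),  # hypothetical weight
    masses_from_probs(rf_probs, reliability=0.8),   # hypothetical weight
)

# Predict the single class carrying the largest fused mass.
singletons = {next(iter(k)): v for k, v in fused.items() if len(k) == 1}
predicted = max(singletons, key=singletons.get)
```

The reliability weights could, for instance, be taken from each classifier's validation accuracy; that choice is also an assumption here, not something the abstract states.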

Keywords/Search Tags:

drug classification, molecular descriptors, fusion model, feature selection

Related items

1	LIBS Combined With Feature Fusion Method Can Improve The Accuracy Of Coal Classification
2	Development Of COX-2 Inhibitor Models Using Machine Learning Methods
3	Design And Implementation Of A Rice Classification Algorithm Based On Fusion Features
4	Clothing Classification Algorithm Based On Attention Mechanism And Feature Fusion
5	Research On Classification And Recognition Method Of Weld Defects Based On Feature Fusion
6	Research On Fabric Image Retrieval Based On Multi-feature Fusion And SVM Classification
7	Feature Selection Algorithm For Imbalanced Data And Its Application In Pixel Classification Of Sandstone Images
8	Research On Glass Products Sorting Method Based On Feature Fusion
9	Classification Of Rice Origin In Heilongjiang Province Based On Cross-media Feature Fusion
10	Research On Feature Extraction Of Welding Defects For Aluminum Alloy During Pulsed GTAW Process Using Multisensor-based Information Fusion