
Prediction Of Toxicity By Structure And Activity Relationship

Posted on: 2014-01-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M Zhong
Full Text: PDF
GTID: 1224330398486920
Subject: Chemical Engineering and Technology
Abstract/Summary:
Carcinogenicity and mutagenicity are among the toxicological endpoints of greatest concern for human health. Blockade of the human ether-a-go-go related gene (hERG) potassium ion channel is one of the major causes of severe cardiotoxicity: it leads to long QT syndrome (LQTS), a predisposing factor for syncope and sudden death. This study consists of three parts.

(1) Thirteen classification models were built to classify chemicals by carcinogenicity, based on a dataset of 852 non-congeneric chemicals derived from the Carcinogenic Potency Database (CPDBAS). 24 MOE molecular descriptors were selected via Pearson correlation, F-score and stepwise regression analysis. These descriptors could be grouped into six classes based on electrophilicity, geometry, molecular weight and size, solubility of the chemicals, etc. The descriptor "mutagenic" showed the highest correlation coefficient with carcinogenicity. Based on these descriptors, the Support Vector Machine (SVM) method was applied to develop a classification model, which was then fine-tuned by 10-fold cross-validation. Both the SVM model (Model A1) and the best model from 10-fold cross-validation (Model B3) gave good results on the test set, with prediction accuracy over 80%, sensitivity over 76%, and specificity over 82%. In addition, extended connectivity fingerprints (ECFP) and the Toxtree software were used to further analyze the functional groups or substructures related to the carcinogenicity of chemicals, and the results of the two methods agreed well.

(2) Six classification models were built to classify chemicals by mutagenicity, based on a dataset of 565 non-congeneric chemicals derived from the Carcinogenic Potency Database (CPDBAS). The data were split into training and test sets using a Self-Organizing Map (SOM) and randomization, and descriptors were selected by Pearson correlation/stepwise regression analysis, F-score and Weka. The best model (Model 21) achieved a prediction accuracy of 88.46%, and the descriptors used by at least three models were analyzed and discussed.

(3) Four models for predicting the IC50 of hERG potassium ion channel blockers were built based on a dataset of 343 compounds. 16 MOE descriptors were selected and analyzed in depth. The data were split into training and test sets using a Self-Organizing Map (SOM) and randomization. The models were built by the Multilinear Regression (MLR) and Support Vector Machine (SVM) methods. Comparing the four models, we found that the SVM models gave much better prediction results than the MLR models.

In summary, chemoinformatics approaches were used to predict the toxicity of chemicals in all three parts of the study, and good results were obtained.
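The accuracy, sensitivity and specificity figures quoted for the classification models follow the usual confusion-matrix definitions. The sketch below illustrates those definitions only; it is not the dissertation's code, the function name classification_metrics is hypothetical, and the labels are made-up values chosen for illustration (1 = toxic, 0 = non-toxic).

```python
import math
from typing import Dict, Sequence


def classification_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Dict[str, float]:
    """Return accuracy, sensitivity and specificity for binary labels.

    Assumes 1 marks the positive (e.g. carcinogenic) class and 0 the negative class.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count the four cells of the confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

    total = tp + tn + fp + fn
    return {
        # Fraction of all compounds classified correctly.
        "accuracy": (tp + tn) / total if total else math.nan,
        # Fraction of true positives recovered: TP / (TP + FN).
        "sensitivity": tp / (tp + fn) if (tp + fn) else math.nan,
        # Fraction of true negatives recovered: TN / (TN + FP).
        "specificity": tn / (tn + fp) if (tn + fp) else math.nan,
    }


# Hypothetical test-set labels, NOT data from the dissertation.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
metrics = classification_metrics(y_true, y_pred)
print(metrics)
```

Reporting sensitivity and specificity alongside accuracy matters when the two classes are unbalanced, because accuracy alone can look good even if one class is predicted poorly.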
Keywords/Search Tags: Carcinogenicity, Mutagenicity, hERG Potassium Ion Channel Blocker, Support Vector Machine (SVM), Kohonen's Self-Organizing Map (SOM), Extended Connectivity Fingerprints (ECFP), Structural Alert (SA)