Applications Of Linear And Nonlinear Methods In QSAR/QSPR

Posted on: 2008-05-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: W P Ma
GTID: 1101360215957788
Subject: Analytical Chemistry
Abstract/Summary:
Quantitative structure-activity/property relationship (QSPR/QSAR) studies are widely used to predict physicochemical properties and biological activities of organic compounds, using different statistical methods and various kinds of molecular descriptors. They are important research topics in computational chemistry and chemoinformatics. QSPR/QSAR has been applied to predict physicochemical properties, biological activities, toxicities and metabolic dynamic parameters of drugs, and the approach has now been introduced into drug design, analytical chemistry, environmental science, food science and materials science.

Over the past ten years, our group has studied in depth the mechanisms of the error back-propagation network (BP), radial basis function neural networks (RBFNN) and support vector machines (SVM). These nonlinear methods have been widely used in the fields listed above, and accurate QSAR/QSPR models were obtained. This thesis presents research on two aspects of the methodology: the first is achieving satisfactory results with simple linear models; the second is developing nonlinear models to achieve more accurate results.

In Chapter 1, a brief introduction to the history, methodology and current status of QSPR/QSAR is presented, together with an overview of its wide applications in drug design, analytical chemistry, environmental science, food science and materials science.

In Chapter 2, a linear method was applied to QSAR, as follows:

(1) A quantitative structure-activity relationship (QSAR) model was developed by the heuristic method (HM) to study the penetration of 245 drugs across a polydimethylsiloxane (PDMS) membrane. The descriptors were calculated with the software CODESSA. HM was used both for preselecting molecular descriptors and for developing the linear model.
The logarithms of the maximum steady-state flux (logJ) were correlated with four descriptors, giving a squared correlation coefficient (R²) of 0.844 and a root-mean-square (RMS) error of 0.438. This work provides a simple and straightforward way to predict the logJ values of drugs from their structures alone and gives some insight into the structural features related to drug penetration.

In Chapter 3, nonlinear methods were applied to QSPR studies:

(1) The retention factors (logk) of 79 heterogeneous pesticides in biopartitioning micellar chromatography (BMC) were studied by QSPR. HM and SVM were used to build linear and nonlinear models, respectively. The SVM model gave much better results than the HM model: the RMS errors for the test set were 1.094 for SVM and 1.644 for HM. The QSPR models proposed by the two methods contain the same descriptors, which agree with the classical Abraham parameters of the well-known linear solvation energy relationships (LSER).

(2) SVM and HM were used to develop nonlinear and linear models relating the solubility of 217 nonelectrolytes in an electrolyte solution containing sodium chloride to three molecular descriptors. The descriptors, which represent the structural features of the compounds, comprise two topological descriptors and one electrostatic descriptor. The three descriptors selected by HM in CODESSA were used as inputs for SVM. Both HM and SVM gave satisfactory results. The HM model gave a correlation coefficient (R) of 0.980 and an RMS error of 0.219 for the test set, while the SVM model gave a predictive R of 0.988 and an RMS error of 0.170 for the test set. The predictions are in very good agreement with the experimental values. The same descriptors were also used to build a model for pure water, and the predictions were consistent with the experimental solubilities.
This work provides a new and effective method for predicting solubility in electrolyte solution and gives some insight into the structural features related to the solubility of nonelectrolytes.

(3) The aim of this work was to predict the electrophoretic mobilities of peptides in capillary zone electrophoresis (CZE) using HM and RBFNN. Two data sets, one of 125 peptides ranging in size from 2 to 14 amino acids and one of 58 peptides ranging in size from 2 to 39 amino acids, were studied to test the applicability of the QSPR methods. For data set 1, the RBFNN model gave RMS errors of 1.3766, 1.5608 and 1.4157 and R² values of 0.9740, 0.9671 and 0.9724 for the training set, the test set and the whole set, respectively. For data set 2, the RMS errors for the training set, the test set and the whole set were 0.6279, 0.8145 and 0.6673 and the R² values were 0.9773, 0.9489 and 0.9732, respectively. Thus Offord's charge-over-mass term (Q/M^(2/3)), combined with descriptors calculated by CODESSA, represents the structural features of the peptides appropriately, and the electrophoretic mobilities of peptides can be accurately predicted by both the linear and the nonlinear models.

(4) RBFNN and HM were used to develop models relating the bioconcentration factors (BCF) of 121 nonionic organic compounds to three molecular descriptors. The three descriptors, selected by HM in CODESSA to represent the structural features of the compounds, comprise a topological, a geometrical and an electrostatic descriptor and were used as inputs for RBFNN. Both HM and RBFNN gave satisfactory results. The HM model gave an R² of 0.888 and an RMS error of 0.551 for the test set, while the RBFNN model gave a predictive R² of 0.923 and an RMS error of 0.416 for the test set. The predictions are in very good agreement with the experimental values.
This work provides an effective method for predicting BCF and gives some insight into the structural features related to the BCF of nonionic organic compounds.
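
The abstract compares models throughout by their RMS error and squared correlation coefficient on a test set. As a minimal sketch of how such figures can be computed, the Python below assumes the RMS error divides by the number of compounds and that the squared correlation coefficient is the squared Pearson correlation between observed and predicted values; the thesis abstract does not state its exact formulas, so these definitions, the function names and the sample numbers are all assumptions for illustration.

```python
import math


def rms_error(observed, predicted):
    """Root-mean-square error, dividing by the number of compounds n (assumed definition)."""
    n = len(observed)
    return math.sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def squared_correlation(observed, predicted):
    """Square of the Pearson correlation between observed and predicted values."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (var_o * var_p)


# Hypothetical test-set values, not data from the thesis.
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [1.1, 1.9, 3.2, 3.8]
print(rms_error(observed, predicted), squared_correlation(observed, predicted))
```

Applying the same two functions to the test-set predictions of a linear and a nonlinear model gives a like-for-like comparison of the kind reported above, with a lower RMS error and a higher squared correlation indicating the better fit.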
Keywords/Search Tags: Computational chemistry, Chemometrics, QSPR/QSAR, SVM, RBFNN, HM