A QSPR Study For The Sensitization And Polarity Parameter Of Some Organic Compounds

Posted on: 2018-01-16
Degree: Master
Type: Thesis
Country: China
Candidate: J Wang
Full Text: PDF
GTID: 2321330518492215
Subject: Chemistry
Abstract/Summary:
With the continuous development of modern industry, many aspects of people's lives have been greatly affected. For example, some organic compounds may be harmful to human health, so their effects are attracting more and more attention and study. Studying the properties of organic compounds through quantitative structure-property relationships (QSPR) reduces the harm caused by animal experiments as well as the economic cost and time, while still ensuring the reliability of the research. To build models that predict compound properties, this thesis combines three classification methods (KNN, KMC and PP) with three modeling methods (MLR, PLS and ANN) to establish QSPR models for the sensitization and the polarity parameters of some organic compounds. The dissertation mainly contains the following parts:

(1) A QSPR study of sensitizing organic compounds was performed using three classification methods: K-nearest neighbor (KNN), k-means clustering (KMC) and projection pursuit (PP). A research sample of 186 sensitizing organic compounds, whose local lymph node assay (LLNA) sensitization data were obtained with the same carrier, was collected from the National Toxicology Program (NTP) database. Descriptors were calculated and selected with the ADMEWORKS ModelBuilder software, the relative standard deviation of the selected descriptors was then calculated, and seven structural descriptors were finally retained as the structural parameters of the samples. A robust diagnostic method was applied to the 186 samples to eliminate outliers. The three classification methods were then used to classify 118 samples. Within each class, the sphere exclusion algorithm was applied to split the samples into a training set and a test set. QSPR models were then built for the classified samples using multiple linear regression (MLR), partial least squares (PLS) and artificial neural networks (ANN), respectively.

(2) A QSPR study of the polarity parameters of some organic compounds was performed using the same three classification methods. Polarity parameter data for 250 organic compounds were selected from the literature, and seven structural descriptors were calculated with the ADMEWORKS ModelBuilder software as the structural parameters of the samples. A robust diagnostic method was applied to the 250 samples to eliminate outliers. The three classification methods were then used to classify 225 samples. Within each class, the sphere exclusion algorithm was applied to split the samples into a training set and a test set. QSPR models were then built for the classified samples using MLR, PLS and ANN, respectively.

(3) The structural similarity of two compounds is calculated with the formula $\cos\theta = \dfrac{\alpha \cdot \beta}{\|\alpha\|\,\|\beta\|}$, where $\alpha$ and $\beta$ are the structural descriptor vectors of the two samples and $\|\alpha\|$ and $\|\beta\|$ are the vector norms. The relative standard deviation is $\mathrm{RSD} = \dfrac{\mathrm{SD}}{\bar{x}} \times 100\%$, where SD is the standard deviation and $\bar{x}$ is the mean. These two formulas are applied to the sensitizing organic compounds and to the polarity parameter compounds to calculate the structural similarity and the relative standard deviation of the structural similarity. By comparing the degree of structural similarity and its relative standard deviation among the compounds used for modeling, the influence of structural similarity on the modeling results is assessed.

(4) After the samples are classified, each of the three modeling methods is used to model and predict the samples. The error between the predicted and experimental values is calculated as $\mathrm{Error} = \dfrac{1}{N}\sum \left(\mathrm{value}_{\mathrm{pre}} - \mathrm{value}_{\mathrm{exp}}\right)^{2}$, which describes the prediction results more accurately and intuitively and allows the advantages and disadvantages of the three classification methods and the three modeling methods to be compared effectively.

The QSPR results show that all three classification methods effectively improve the model predictions for both the sensitization samples and the polarity parameter samples. After KNN and k-means clustering classification, the prediction results for one class are good while those for the other class are relatively poor; after projection pursuit classification, the prediction results for the classified compounds are better than those for the unclassified compounds. In terms of prediction quality, the polarity parameter samples, whose compounds are more similar to one another, are predicted better than the sensitization samples, whose compounds are less similar. Although there is no strict relationship between structural similarity and prediction quality, raising the similarity within each class through classification does effectively improve the modeling predictions.
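The three formulas quoted in parts (3) and (4) of the abstract (cosine similarity between two descriptor vectors, relative standard deviation, and the mean squared prediction error) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the thesis's implementation: the function names and the example numbers are hypothetical, and the sample standard deviation (divisor n - 1) is assumed because the abstract does not say which form of SD was used.

```python
import math
from statistics import mean, stdev


def cosine_similarity(a, b):
    """Cosine of the angle between two descriptor vectors a and b."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def relative_standard_deviation(values):
    """RSD = SD / mean * 100 (in percent).

    Assumption: SD is the sample standard deviation (n - 1 divisor).
    """
    return stdev(values) / mean(values) * 100.0


def mean_squared_error(predicted, experimental):
    """Error = sum((value_pre - value_exp)^2) / N."""
    n = len(predicted)
    return sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / n


if __name__ == "__main__":
    # Hypothetical seven-descriptor vectors for two compounds.
    compound_1 = [1.2, 0.4, 3.1, 0.0, 2.2, 5.0, 0.7]
    compound_2 = [1.0, 0.5, 2.9, 0.1, 2.0, 4.8, 0.9]
    print("similarity:", cosine_similarity(compound_1, compound_2))

    # Hypothetical pairwise similarities within one class.
    print("RSD (%):", relative_standard_deviation([0.91, 0.95, 0.97, 0.93]))

    # Hypothetical predicted vs. experimental property values.
    print("error:", mean_squared_error([0.50, 0.62, 0.71], [0.48, 0.65, 0.70]))
```

Swapping `stdev` for `statistics.pstdev` would give the population form of the RSD instead; which one matches the thesis cannot be determined from the abstract alone.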
Keywords/Search Tags: Sensitization, Polarity parameter, Projection Pursuit, K-nearest neighbor, K-means clustering, the structural similarity of organic compounds, QSPR