The Application Of QSPR/QSAR In The Prediction Of The Hazard Properties Of Organic Compounds

Posted on: 2013-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: J Li
GTID: 2251330425971959
Subject: Safety Technology and Engineering
Abstract/Summary:
In recent years, with the rapid development of China's economy and chemical industry, a growing number of hazardous chemicals have entered production, management, transport and use, posing potential threats to people and society. Assessing the hazards of chemicals has therefore become increasingly important. Physicochemical properties are essential inputs to the evaluation of hazardous chemicals, yet for various reasons no complete database of these properties exists. Quantitative Structure-Property/Activity Relationship (QSPR/QSAR) modelling provides a reliable means of assessing hazardous chemicals. Once a reliable model is established, it can be used to predict various properties of new materials, and even of compounds not yet synthesized. It also gives insight into the underlying molecular structure, which helps in the design of new molecules.

In this study, molecular descriptors are selected by genetic function approximation (GFA), and multiple linear regression (MLR) is used to build linear models; non-linear models are then built with a back-propagation neural network (BPNN) and support vector machines (SVM). The results are satisfactory. The main contents are as follows:

The first chapter presents the basic principles of QSPR/QSAR, its research steps and research progress, and explains the basic principles of BPNN and SVM in detail.

The second chapter establishes a QSAR model for the acute toxicity of aliphatic compounds. Molecular descriptors are selected by GFA, and MLR and BPNN are used to build linear and non-linear models relating acute toxicity to the descriptors. On the test set, the squared multiple correlation coefficients (R²) are 0.760 and 0.814, and the average absolute errors (AAE) are 0.314 mmol/L and 0.296 mmol/L, respectively. The results show that the non-linear model fits and predicts more accurately than the linear model. The method provides a way to predict the acute toxicity of aliphatic compounds from molecular structure.

The third chapter predicts the lower flammable limit of 1056 organic compounds. Four structural parameters closely related to the lower flammable limit are selected by GFA, and MLR and BPNN are used to build linear and non-linear models. On the test set, the R² values are 0.956 and 0.978, and the root mean square errors (RMSE) are 0.107 vol% and 0.077 vol%, respectively. The results show that the BPNN model performs better than the MLR model.

The fourth chapter relates the flash point of 91 fatty alcohols to molecular structure using MLR and SVM. On the test set, the R² values are 0.976 and 0.979, and the AAE values are 2.870 K and 2.706 K, respectively. The results show that the three descriptors selected by GFA characterize the flash point of fatty alcohols well.

In the fifth chapter, the QSPR method is used to study the quantitative structure-property relationship of liquid hydrocarbons. A model correlating the heat of combustion of liquid hydrocarbons with three descriptors is built using MLR and SVM. On the test set, the R² values are 0.992 and 0.993, and the AAE values are 121 kJ/mol and 88 kJ/mol, respectively. This method provides an effective way to predict the heat of combustion of liquid hydrocarbons.
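The linear part of the workflow summarized above (fit an MLR model on selected descriptors, then report R², AAE and RMSE on a test set) can be sketched as follows. This is a minimal illustration under stated assumptions, not the thesis's implementation: the descriptor matrix and property values are synthetic, the function names are hypothetical, the GFA descriptor selection step is omitted, and R² is computed as 1 - SS_res/SS_tot, which may differ from the definition used in the thesis.

```python
import numpy as np

def fit_mlr(X, y):
    """Fit y = b0 + X @ b by ordinary least squares; returns [b0, b1, ...]."""
    A = np.column_stack([np.ones(len(X)), X])  # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict_mlr(coef, X):
    """Predict property values from descriptors with fitted coefficients."""
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

def aae(y_true, y_pred):
    """Average absolute error."""
    return np.mean(np.abs(y_true - y_pred))

def rmse(y_true, y_pred):
    """Root mean square error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Synthetic stand-in data: 60 "compounds", 3 "descriptors" (NOT thesis data).
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
true_coef = np.array([1.5, -0.7, 0.3])
y = 2.0 + X @ true_coef + rng.normal(scale=0.1, size=60)

# Simple hold-out split: first 45 rows for training, last 15 for testing.
X_train, X_test = X[:45], X[45:]
y_train, y_test = y[:45], y[45:]

coef = fit_mlr(X_train, y_train)
y_pred = predict_mlr(coef, X_test)

print("test R^2 :", r_squared(y_test, y_pred))
print("test AAE :", aae(y_test, y_pred))
print("test RMSE:", rmse(y_test, y_pred))
```

The non-linear models (BPNN, SVM) would replace `fit_mlr` and `predict_mlr` with a different regressor while keeping the same test-set metrics, which is what makes the paired figures in each chapter directly comparable.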
Keywords/Search Tags: chemoinformatics, quantitative structure-property/activity relationship, genetic function approximation, back-propagation network, support vector machines