| Thermal safety plays a crucial role in safe chemical production. Research on chemical safety parameters such as viscosity, heat capacity, thermal conductivity, and autoignition temperature is of theoretical and practical significance to the chemical and petrochemical industries. Based on the principle of the Quantitative Structure-Property Relationship (QSPR), and starting from molecular structure, QSPR models of viscosity, heat capacity, thermal conductivity, and autoignition temperature were built, and the structural factors affecting each property were explored. The main research contents and conclusions are as follows. (1) Molecular structure descriptors were calculated with the Dragon 2.1 software. The compounds were grouped into categories with the Iterative Self-Organizing Data Analysis Techniques Algorithm (ISODATA), and compounds were selected randomly from these categories as the training set for building the QSPR models. The Genetic Algorithm (GA) and the Ant Colony Algorithm (ACO) were each used to screen the molecular structure descriptors and select feature descriptors, from which the structural factors affecting each property were identified. With the feature descriptors as model inputs, Multiple Linear Regression (MLR) and Support Vector Machine (SVM) techniques were employed to establish GA-MLR, GA-SVM, ACO-MLR, and ACO-SVM models. The models were validated using their evaluation parameters. The applicability domain of each model was studied with a Williams plot, which showed that the standardized residuals and leverage values of some compounds exceeded the corresponding thresholds; the causes of these "outlier" compounds were therefore analyzed. 
Compared with the models in the literature, the models in this work showed certain advantages. (2) The quantitative relationship between the viscosity and the molecular structure of organic compounds was investigated according to the QSPR principle, with the following results. 310 compounds were classified into eleven categories through ISODATA, and 248 compounds were used as the training set to build the viscosity models. The feature descriptors were screened by GA and ACO. The results showed that the entropy of solution and the hydrophilic group were the dominant structural factors relevant to the viscosity of organic compounds. Moreover, the nOH descriptor selected by GA indicated that the hydrophilic group was hydroxyl. The R² of all four models exceeded 0.75, so all the viscosity models were considered satisfactory. The GA-SVM model was superior to the GA-MLR model, and the ACO-SVM model was better than the ACO-MLR model. The R² of GA-MLR and GA-SVM were about 0.90, which indicated that GA screened the structural features influencing viscosity well. (3) QSPR models of heat capacity were built using similar methods, with the following results. 650 compounds were classified into sixteen categories through ISODATA, and 520 compounds were used as the training set. GA and ACO were each applied to screen the molecular descriptors, and each yielded five descriptors. Both methods selected the same descriptor (SlK), which made the largest contribution to heat capacity. This descriptor mainly reflects the influence of the hybridized atoms and their hybridization state on molecular shape. The heat capacity models were excellent, with R² above 0.90. Comparing the models showed that the GA-MLR model was superior to the ACO-MLR model, and the GA-SVM model to the ACO-SVM model. 
The R² of GA-MLR and GA-SVM exceeded 0.95, so the predictions were very satisfactory. (4) QSPR models of thermal conductivity were built in the same way, with the following results. 178 compounds were classified into ten categories through ISODATA, and 142 compounds were used as the training set. The feature descriptors obtained by GA mainly reflected the number of fluorine atoms, molecular polarizability, atom pairs, molecular charge transfer, van der Waals volume, and so on. These feature descriptors were used as model inputs to develop GA-MLR and GA-SVM models. The R² of both models was above 0.70, indicating good predictive ability. In addition, the GA-SVM model was superior to the GA-MLR model, which showed that SVM was a rapid, effective, and accurate method for predicting thermal conductivity. (5) The QSPR results for autoignition temperature were as follows. 265 compounds were classified into five categories through ISODATA, and 212 compounds were used as the training set. The feature descriptors obtained by GA demonstrated that molecular size, the number of branched chains, and the three-dimensional spatial structure were the dominant factors affecting autoignition temperature. With these feature descriptors as model inputs, GA-MLR and GA-SVM models were obtained. The R² of both models exceeded 0.70, which was acceptable. An ACO-SVM model was built with the input parameters screened by ACO, and its R² for both the training set and the test set exceeded 0.80, which revealed a strong nonlinear relationship between autoignition temperature and the feature descriptors selected by ACO. In addition, GA-SVM and ACO-SVM were of higher quality than GA-MLR, which further indicated a strong nonlinear relationship between autoignition temperature and the feature descriptors. |
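
The applicability-domain check described in item (1) can be illustrated with a short sketch. It is not the authors' implementation: the abstract does not state the thresholds, so the warning leverage h* = 3(p + 1)/n and the limit of ±3 on standardized residuals used here are commonly used conventions taken as assumptions, and the function names `williams_statistics` and `flag_outliers` as well as the synthetic data are hypothetical. The sketch fits an MLR model by least squares and computes the two quantities plotted in a Williams plot.

```python
import numpy as np


def williams_statistics(X, y):
    """Leverages, standardized residuals, and warning leverage for an MLR fit."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # design matrix with intercept
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)  # ordinary least squares
    residuals = y - A @ coef
    # Diagonal of the hat matrix: h_i = a_i^T (A^T A)^-1 a_i
    AtA_inv = np.linalg.pinv(A.T @ A)
    leverage = np.einsum("ij,jk,ik->i", A, AtA_inv, A)
    # Residual standard deviation with n - p - 1 degrees of freedom
    s = np.sqrt(residuals @ residuals / (n - p - 1))
    std_resid = residuals / s
    h_star = 3.0 * (p + 1) / n                    # assumed warning leverage
    return leverage, std_resid, h_star


def flag_outliers(leverage, std_resid, h_star, resid_limit=3.0):
    """True where a compound falls outside the assumed applicability domain."""
    return (leverage > h_star) | (np.abs(std_resid) > resid_limit)


# Synthetic demonstration data (not from the thesis): 40 "compounds",
# two descriptors, one structural outlier and one response outlier.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(40, 2))
X_demo[0] = [8.0, 8.0]                            # unusual descriptor values
y_demo = 1.0 + 2.0 * X_demo[:, 0] - X_demo[:, 1] + rng.normal(scale=0.1, size=40)
y_demo[1] += 5.0                                  # unusual property value

lev, res, h_star = williams_statistics(X_demo, y_demo)
flags = flag_outliers(lev, res, h_star)
print("warning leverage h* =", h_star)
print("flagged compound indices:", np.flatnonzero(flags))
```

A compound with leverage above h* is structurally far from the training set, whereas a large standardized residual indicates a poorly predicted property value; the two cases call for different explanations when the "outlier" compounds are analyzed.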