| Experimental determination of the absolute configuration of chiral compounds is often expensive and time-consuming; theoretical prediction of specific optical rotation can help address this problem. In this thesis, the absolute configurations of enantiomers were automatically identified and the specific optical rotation values of chiral molecules were quantitatively predicted by machine learning methods. The research details are as follows. 1. Automatic identification of the absolute configurations of chiral ionic liquids. The PAS descriptors encoding the chiral cationic structures were combined with binary indicator variables representing the achiral anions to form the molecular chirality descriptors. A Counter-propagation Neural Network (CPG NN) was used to predict the specific optical rotation. The maps of the output layers clearly showed that the PAS descriptors can distinguish between levorotatory and dextrorotatory compounds and assign compounds with high absolute specific optical rotation to specific regions. Moreover, the CPG NNs revealed the diversity of the chemical spaces covered by ionic liquids containing different anions and visualized the relationship between cations, anions and specific optical rotations. The final quantitative prediction of the CPG NN gave RMSE = 22° for the test set. Based on the same data set and chirality descriptors, the CPG network correctly identified the absolute configuration of most of the enantiomers in the test set. Since the PAS descriptors are classification-type indices and supervised machine learning methods are often better than the semi-supervised CPG NN in quantitative prediction, we proposed quantitative ePAS descriptors and used the Multilayer Perceptron (MLP), Random Forest (RF) and Multiple Linear Regression (MLR) to construct quantitative prediction models. Among these, the best model was obtained as follows: the combination of PAS and ePAS was submitted to the RF to perform variable
selection, and then the RF model was built using the 30 most relevant descriptors. The corresponding RMSE values of the training set and the test set were between 10° and 11°. These quantitative results were also significantly better than those obtained with PAS. When RF was used for qualitative prediction, the resulting model correctly identified the absolute configuration of 95% of the enantiomers in the test set. 2. Specific optical rotation prediction of chiral fluorides. The PAS descriptors were used to represent 44 chiral fluoride enantiomers and to build qualitative and quantitative prediction models for specific optical rotations. For qualitative prediction, the signs of the specific optical rotations were represented by +1 or -1 as the output of the CPG NN. The distribution of the training-set compounds on the map verified the ability of the PAS descriptors to distinguish between levorotatory and dextrorotatory fluorides. The PAS descriptors of the test set were also mapped onto the trained CPG NN, and the 8 pairs of enantiomers in the test set were displayed on the activated neurons and correctly classified. In the leave-one-pair-out cross-validation of the whole data set, 41 of the 44 enantiomeric pairs were correctly identified. These results indicated that the established qualitative model was satisfactory and can correctly identify most of the L-compounds and D-compounds. The PAS, PAS+ePAS and cPAS descriptors were individually used to represent the structures of the chiral fluorinated molecules and to construct the quantitative models. Since irrelevant variables may increase the computational complexity and decrease the classification accuracy, we selected descriptors for the quantitative structure-activity relationship studies based on the variable importance of RF. Among them, a subset of 11 descriptors was obtained from the cPAS descriptors, which are derived from the common structural features of chiral fluorinated molecules. The RF model was constructed from this
subset, and it yielded the best quantitative prediction results. For the leave-one-pair-out cross-validation of the whole data set, R was 0.969 and RMSE was 11.4°. In addition, the specific optical rotations of 30 compounds in the data set were measured in chloroform. We used machine learning methods to predict their specific optical rotations and compared the predictions with quantum chemistry results from the literature. The results showed that the machine learning methods can not only predict the specific optical rotation of fluorinated molecules quickly, but also reach an accuracy comparable to that of quantum chemical calculations. 3. Absolute configuration prediction of major products for chiral resolution of secondary alcohols. From the literature, we selected 34 secondary alcohols together with the enantiomeric products, and their enantiomeric excess (ee) values, obtained by chiral resolution under the same conditions using these alcohols as the reaction substrates. To predict the absolute configuration of the major product, the major product was represented by +1 and the minor product by -1. The secondary alcohols, represented by PAS descriptors, were separately submitted to the CPG network, Multilayer Perceptron (MLP), Multiple Linear Regression (MLR) and Random Forest (RF) to construct qualitative prediction models. The results showed that the rates of correct prediction in the cross-validation of the whole data set were 97% to 100%. In addition, based on the variables selected by RF, M5 and Greedy, we found that orbital electronegativity and charge density play an important role in predicting the absolute configuration of the major products. |
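
The leave-one-pair-out cross-validation reported above can be illustrated with a minimal sketch. It is not the implementation used in the thesis: the data are invented toy values, the single descriptor `x` is hypothetical, a least-squares line through the origin stands in for the RF model, and R is assumed to denote the Pearson correlation coefficient. The only point shown is that both enantiomers of a pair are held out together, so a model is never tested on the mirror image of a compound it was trained on.

```python
import math


def leave_one_pair_out(pair_ids):
    """Yield (train_idx, test_idx); each split holds out one whole enantiomeric pair."""
    for held_out in sorted(set(pair_ids)):
        test = [i for i, p in enumerate(pair_ids) if p == held_out]
        train = [i for i, p in enumerate(pair_ids) if p != held_out]
        yield train, test


def rmse(y_true, y_pred):
    """Root-mean-square error between observed and predicted values."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient (assumed meaning of R in the text)."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


def fit_slope(x, y):
    """Stand-in model (not RF): least-squares slope k of y = k * x through the origin."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)


def cross_validate(x, y, pair_ids):
    """Predict every compound from a model fitted without its own enantiomeric pair."""
    preds = [0.0] * len(y)
    for train, test in leave_one_pair_out(pair_ids):
        k = fit_slope([x[i] for i in train], [y[i] for i in train])
        for i in test:
            preds[i] = k * x[i]
    return preds


# Invented toy data: three enantiomeric pairs, one hypothetical descriptor each.
x = [1.0, -1.0, 2.0, -2.0, 3.0, -3.0]          # descriptor values
y = [10.0, -10.0, 21.0, -21.0, 29.0, -29.0]    # specific rotations in degrees
pair_ids = [0, 0, 1, 1, 2, 2]                  # compounds sharing an id are enantiomers

preds = cross_validate(x, y, pair_ids)
sign_hits = sum((p > 0) == (t > 0) for p, t in zip(preds, y))
print("correct signs:", sign_hits, "of", len(y))
print("R =", round(pearson_r(y, preds), 3), "RMSE =", round(rmse(y, preds), 2))
```

The sign of each prediction plays the role of the qualitative (+1 or -1) output, while R and RMSE summarize the quantitative agreement over all held-out predictions.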