
Accurate Prediction Of The Energy And Thermodynamic Properties Of The Molecule

Posted on: 2006-12-18
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X M Duan
GTID: 1101360155960467
Subject: Physical chemistry
Abstract/Summary:
Quantum chemistry, a fundamental discipline that studies the properties and interactions of molecules and their microscopic mechanisms, has seen remarkable development in its underlying theories and methods over the past decades. It has been very successful in interpreting and predicting the properties of small and medium-sized molecules, such as electronic structure, heats of formation, activation energies of reactions, and NMR spectra, and has been extended to the optimization and design of functional materials, biosystem modeling, drug design, drug screening, and so on. One of the Holy Grails of quantum mechanical calculation is to predict properties of matter prior to experiments and to examine physical properties or processes that are inaccessible to experiments.

Despite their success, the results of first-principles quantum mechanical calculations contain inherent numerical errors caused by various intrinsic approximations, in particular for complex systems. These errors originate mainly from the treatment of electron correlation and from the basis set. Because of its computational cost, electron correlation has always been a major stumbling block for first-principles calculations. Many methods and theories for relatively accurate evaluation of electron correlation have been proposed in recent years, such as configuration interaction (CI), coupled-cluster theory (CC), Møller-Plesset perturbation theory (MP), Gaussian-1, Gaussian-2, Gaussian-3, and complete basis set methods. These procedures are computationally very demanding and are still inapplicable to complex systems. Density-functional theory (DFT) offers a promising alternative for tackling electron correlation. The best DFT methods now have a precision similar to that of MP2 methods, while their computational cost is comparable to that of the HF method. However, the errors of DFT calculations accumulate with the size of the molecule.
Thus, a balance has to be found between accuracy and efficiency.

In this thesis, approaches that combine first-principles calculations with linear regression or neural networks are proposed to correct the systematic errors in the energies and heats of formation of molecules calculated at low levels of theory. We also compare the results of the linear regression correction approach with those of the neural network correction approach, and systematically investigate the computational methods, basis sets and physical descriptors. With general descriptors, these combined methods can largely eliminate the systematic errors of theoretical calculations that arise from neglecting electron correlation and using small basis sets, and provide a novel tool for predicting the properties of molecules.

Because the interaction of the electrons is the main origin of the electron correlation energy, we select as physical descriptors the numbers of bonding electrons, lone-pair electrons and inner-layer electrons in the molecule, together with the number of unpaired electrons of the constituent atoms in their ground states, and employ the linear regression correction (LRC) method to correct the systematic errors in the calculated heats of formation of 180 small and medium-sized organic molecules. The deviations of the calculated values from experiment are greatly reduced, especially for the HF method, for which the root mean square (RMS) deviations decrease by a factor of more than 60. For the HF/6-31G(d) and HF/6-311+G(d,p) methods, the RMS deviations of the heats of formation of the 180 molecules are reduced from 392.0 and 395.0 kcal/mol to 6.2 and 6.1 kcal/mol, respectively; for the B3LYP/6-31G(d) and B3LYP/6-311+G(d,p) methods, they decrease from 10.8 and 20.9 kcal/mol to 3.4 and 3.0 kcal/mol, respectively.
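The idea of a linear regression correction can be illustrated with a minimal sketch. It assumes, since the abstract does not give the exact functional form, that the error (experiment minus calculation) is fitted as a linear combination of the descriptors without an intercept. The data below are synthetic, the coefficients in `true_c` are hypothetical, and the helper names `fit_lrc`, `apply_lrc` and `rms` are illustrative, not taken from the thesis.

```python
import numpy as np

def fit_lrc(descriptors, calc, expt):
    """Least-squares coefficients c such that calc + descriptors @ c ~ expt."""
    X = np.asarray(descriptors, dtype=float)
    error = np.asarray(expt, dtype=float) - np.asarray(calc, dtype=float)
    coeffs, *_ = np.linalg.lstsq(X, error, rcond=None)
    return coeffs

def apply_lrc(descriptors, calc, coeffs):
    """Add the fitted linear correction to the calculated values."""
    return np.asarray(calc, dtype=float) + np.asarray(descriptors, dtype=float) @ coeffs

def rms(a, b):
    """Root mean square deviation between two arrays."""
    d = np.asarray(a, dtype=float) - np.asarray(b, dtype=float)
    return float(np.sqrt(np.mean(d ** 2)))

# Synthetic demonstration data (NOT from the thesis).
rng = np.random.default_rng(0)
n_mol = 50
# Assumed column order: bonding, lone-pair, inner-layer electrons in the
# molecule, and unpaired electrons of the constituent atoms.
X = rng.integers(0, 20, size=(n_mol, 4)).astype(float)
true_c = np.array([-3.0, -1.0, -5.0, 2.0])   # hypothetical coefficients
expt = rng.normal(0.0, 30.0, size=n_mol)     # stand-in "experimental" values
# "Calculated" values carry a descriptor-dependent systematic error plus noise.
calc = expt - X @ true_c + rng.normal(0.0, 1.0, size=n_mol)

coeffs = fit_lrc(X, calc, expt)
corrected = apply_lrc(X, calc, coeffs)
rms_raw = rms(calc, expt)
rms_corrected = rms(corrected, expt)
print("RMS before correction:", rms_raw)
print("RMS after correction: ", rms_corrected)
```

Because the fit minimizes the squared error on the training set, the corrected RMS deviation on that set cannot exceed the uncorrected one; how well the coefficients transfer to molecules outside the training set has to be checked separately.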
Most importantly, after linear regression correction the deviations for large molecules are of the same magnitude as those for small molecules for both HF and DFT methods, which indicates that the linear regression correction method does not discriminate against large molecules and can potentially be applied to much larger systems. The coefficients of partial correlation Vj are calculated to assess the validity of the physical descriptors. It is found that the bonding electrons, the inner-layer electrons and the unpaired electrons of the constituent atoms are very important for correcting the systematic errors in the heats of formation. We also test the relative contribution of each physical descriptor by leaving out one descriptor at a time and examining the increase in the RMS deviation. The results are consistent with the analysis of the coefficients of partial correlation. The linear regression correction approach accounts for most of the errors caused by neglecting electron correlation energy and using small basis sets, and the physical descriptors are not tied to specific properties of the molecules; this combined method is therefore feasible for accurate prediction of molecular properties.

Since physical descriptors derived from electron pairs in different chemical environments might result in a non-continuous potential energy surface, and the training set contains only closed-shell organic molecules, we improve the linear regression correction method in the following two respects. (1) The electron populations of different types of natural bond orbitals (NBO) are used as physical descriptors, including 2-center bonds (BD), 1-center core pairs (CR), 1-center valence lone pairs (LP), 1-center Rydberg orbitals (RY*), 2-center antibonds (BD*), and valence non-Lewis lone pairs (LP*); the number of unpaired electrons of the constituent atoms in their ground states is also included as a descriptor.
(2) The training set is enlarged to contain 350 heats of formation of small and medium-sized organic and inorganic molecules and radicals. The heats of formation calculated by the HF/6-31G(d), HF/6-311G(2d,d,p), B3LYP/6-31G(d), B3LYP/6-311+G(d,p), B3LYP/6-311G(2d,d,p) and B3LYP/6-311+G(3df,2p) methods are corrected by the linear regression correction method. The RMS deviations of the 350 heats of formation are reduced from 327.1, 330.6, 11.2, 19.6, 15.3 and 6.7 kcal/mol to 10.2, 9.6, 4.5, 5.6, 4.0 and 3.2 kcal/mol, respectively, upon linear regression correction. We also calculate the partial correlation coefficients to assess the relative importance of the physical descriptors. It is found that not only the bonding electrons, inner-layer electrons and unpaired electrons, but also the electrons in low-occupancy orbitals, have a significant effect on the correction of the systematic errors. Comparing the results obtained with the two kinds of descriptors, NBO descriptors and electron-pair descriptors, we find that the NBO descriptors give better corrections. Reaction barrier heights can also be corrected with the current LRC approach. Using the correction coefficients obtained from the B3LYP/6-311G(2d,d,p) linear regression correction, the 12 barrier heights of 6 reactions are corrected. The mean absolute deviation of the 12 barrier heights is reduced from 5.3 kcal/mol to 2.9 kcal/mol, almost the same accuracy as that achieved for the heats of formation.

On the other hand, the neural network (NN) method employing the NBO descriptors is used to correct the heats of formation calculated by the HF/6-31G(d), HF/6-311G(2d,d,p), B3LYP/6-31G(...
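The leave-one-descriptor-out test mentioned in the abstract (refit without one descriptor and record the increase in the RMS deviation) can be sketched as follows. This is only an illustration under stated assumptions: the data are synthetic, the descriptor names and the coefficients in `true_c` are hypothetical, the fit is an ordinary least-squares fit without intercept, and the helper names are not from the thesis.

```python
import numpy as np

def rms_after_fit(X, error):
    """RMS of what remains of `error` after a least-squares fit on X."""
    coeffs, *_ = np.linalg.lstsq(X, error, rcond=None)
    return float(np.sqrt(np.mean((error - X @ coeffs) ** 2)))

def leave_one_descriptor_out(X, error, names):
    """Return the full-model RMS and the RMS increase when each descriptor is dropped."""
    full_rms = rms_after_fit(X, error)
    increase = {}
    for j, name in enumerate(names):
        reduced = np.delete(X, j, axis=1)  # descriptor matrix without column j
        increase[name] = rms_after_fit(reduced, error) - full_rms
    return full_rms, increase

# Synthetic demonstration data (NOT from the thesis).
rng = np.random.default_rng(1)
names = ["bonding", "lone_pair", "inner_layer", "unpaired"]  # illustrative labels
X = rng.integers(0, 20, size=(60, 4)).astype(float)
# Hypothetical coefficients; "lone_pair" is given no influence on purpose,
# so that dropping it should barely change the RMS deviation.
true_c = np.array([-3.0, 0.0, -5.0, 2.0])
error = X @ true_c + rng.normal(0.0, 1.0, size=60)

full_rms, increase = leave_one_descriptor_out(X, error, names)
print("RMS with all descriptors:", full_rms)
for name in names:
    print(f"RMS increase without {name}: {increase[name]:.3f}")
```

A descriptor whose removal produces a large RMS increase carries information the remaining descriptors cannot replace, which is the sense in which such a test can be compared with a partial correlation analysis.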
Keywords/Search Tags: linear regression correction approach, neural network correction approach, heats of formation, coefficient of partial correlation, cross-validation