
Statistical Analysis Of Photochemical Characteristics For New Four Bases Of DNA

Posted on: 2020-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: J L Xing
GTID: 2417330578468987
Subject: Applied Statistics
Abstract/Summary:
Cytosine methylation of deoxyribonucleic acid (DNA) is a major epigenetic modification that plays an important role in gene regulation, genomic stability, transcription, and the development of many human cancers and other diseases. In recent years, 5-methylcytosine (5mC, cytosine methylated at the 5-position) and its sequential oxidation products 5-hydroxymethylcytosine (5hmC), 5-formylcytosine (5fC), and 5-carboxylcytosine (5caC) have come to be regarded as the "new" four bases of DNA and have become a research hotspot. In this thesis, the deactivation pathways of these four bases are analyzed using data generated by conventional quantum chemical calculations. First, the ground-state and excited-state structures of the four bases, as well as their ring-distortion, molecular-isomerization, intersystem-crossing, N-H bond-dissociation, and H-transfer conical intersections, are optimized under neutral and acidic conditions using density functional theory and the complete active space self-consistent field (CASSCF) method. The preferred deactivation pathway of each structure is identified by comparing deactivation barriers obtained with the linear interpolation method. For the first time, principal component analysis (PCA) and neural network prediction are applied to investigate the photochemistry of each structure, with the aim of extracting more information from the data than an inspection of surface structural changes alone provides. Compared with the ground-state structures, the excited-state and intersection-point structures show changes in bond lengths and dihedral angles. To identify the main factors affecting the energy, PCA is applied and its results are compared with existing results. The results show that, compared with conventional quantum chemical analysis, this approach not only identifies the main structural factors but also explains the reasons for the structural changes, except in the case of the molecular-isomerization crossing points of 5-carboxylcytosine. Because quantum chemical calculations are time-consuming and some individual structures are difficult to converge, a neural network is used to predict energies. The structures and energies generated during the iteration process serve as the data for predicting single-point energies. A three-layer neural network with 5 hidden nodes is used, with 70% of the data as the training set and the remainder as the test set, and the model is evaluated by the average relative error. The prediction errors are all below 0.1%, which is regarded as an acceptable result. The reasons for the unsatisfactory performance of individual models are also examined; a large number of outliers and a small amount of data are two important causes.
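The evaluation setup described in the abstract (a 70%/30% split, a three-layer network with 5 hidden nodes, and the average relative error) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the synthetic "geometry descriptors" and "energies", the tanh activation, the learning rate, the number of epochs, and all function names are assumptions made for the example.

```python
import numpy as np


def mean_relative_error(y_true, y_pred):
    """Average of |prediction - reference| / |reference|."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true) / np.abs(y_true)))


def split_70_30(X, y, rng):
    """Shuffle the samples and keep 70% for training, the rest for testing."""
    idx = rng.permutation(len(X))
    n_train = int(round(0.7 * len(X)))
    train, test = idx[:n_train], idx[n_train:]
    return X[train], y[train], X[test], y[test]


def fit_three_layer_net(X, y, n_hidden=5, lr=0.05, epochs=2000, seed=0):
    """Input layer -> 5 tanh hidden nodes -> linear output.

    Trained by full-batch gradient descent on half the mean squared error.
    Activation, learning rate, and epoch count are assumed, not from the source.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(0.0, 0.5, size=(X.shape[1], n_hidden))
    b1 = np.zeros(n_hidden)
    W2 = rng.normal(0.0, 0.5, size=n_hidden)
    b2 = 0.0
    n = len(y)
    for _ in range(epochs):
        H = np.tanh(X @ W1 + b1)            # hidden activations
        err = H @ W2 + b2 - y               # residuals of the linear output
        dH = np.outer(err, W2) * (1.0 - H ** 2)  # backprop through tanh
        W2 -= lr * (H.T @ err) / n
        b2 -= lr * err.mean()
        W1 -= lr * (X.T @ dH) / n
        b1 -= lr * dH.mean(axis=0)

    def predict(Z):
        return np.tanh(Z @ W1 + b1) @ W2 + b2

    return predict


# Hypothetical data: three structural descriptors per sample and an
# energy-like target with a large constant offset (made up for illustration).
rng = np.random.default_rng(42)
X = rng.uniform(-1.0, 1.0, size=(200, 3))
y = -394.0 + 0.05 * X[:, 0] ** 2 + 0.02 * X[:, 1]

X_tr, y_tr, X_te, y_te = split_70_30(X, y, rng)

# Standardize the target using training statistics, then undo it for scoring.
mu, sigma = y_tr.mean(), y_tr.std()
predict = fit_three_layer_net(X_tr, (y_tr - mu) / sigma)
y_hat = predict(X_te) * sigma + mu

mre = mean_relative_error(y_te, y_hat)
print(f"test mean relative error: {mre:.6%}")
```

One general property of this metric is worth keeping in mind: when the reference values are large in magnitude and vary only slightly, the relative error is small even for a rough fit, so it is often reported alongside an absolute error measure.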
Keywords/Search Tags:Principal Component Analysis, Neural Network, New Four Bases of DNA, Density Functional Theory