A Series Of Topological Indices And Their Applications To Uniqueness Characterization Of Compounds

Posted on: 2019-01-07
Degree: Master
Type: Thesis
Country: China
Candidate: K X Xiao
Full Text: PDF
GTID: 2371330548964363
Subject: Analytical Chemistry
Abstract/Summary:
Topological indices are applied mainly in two areas: quantitative structure-activity/property relationship (QSAR/QSPR) studies and chemical information management. Chemical information management requires a topological index with high discriminating power, so the uniqueness of the index should be tested on a large number of compounds. Based on virtual data sets generated by our previous work (such as a structure generator), this thesis studies the generation, uniqueness testing, and applications of topological indices. The details are as follows:

1. Studies on a highly selective molecular topological index

A compound can sometimes be represented by several different chemical graphs. Chemists regard these graphs as equivalent chemical structures and generally expect them to be represented by the same descriptors. To solve this problem, this thesis proposes a highly selective molecular topological index, ATID (adjacent topology identification), derived from 3-EAID. First, the uniqueness of ATID was tested on two large virtual data sets: 1) over 60 million acyclic alkane isomers with 1-25 carbon atoms; 2) more than 19 million benzenoids composed of 1-14 benzene rings. No degeneracy was found, that is, ATID has high discriminating power. ATID was then successfully applied to the retrieval of duplicate structures in real databases. The results indicated 13 pairs of equivalent chemical structures among the more than 200 thousand compounds of the NCI database, and more than 80,000 pairs among the more than 20 million structures of the ZINC database. These equivalent structures arose mainly from conjugated systems or from differences in charge distribution.

2. Studies on a highly selective atomic topological index

A highly selective atomic topological index can be used to identify equivalent atoms, which is one of the basic tasks in developing unique descriptors of chiral compounds. This thesis proposes an atomic topological index, aEAID, based on the molecular EAID. Its uniqueness was tested on a virtual data set of 3,851,864 atoms derived from acyclic alkanes (2-19 carbon atoms), and no degeneracy occurred. When the uniqueness test was performed on 3,814,521 atoms derived from the NCI database, however, 12 pairs of degenerate atoms within 7 pairs of molecules were found. To improve the discriminating ability of aEAID, a distance factor was introduced to generate d-aEAID. The two atomic data sets above were each used to test the uniqueness of d-aEAID, and no degeneracy was found. Furthermore, the two atomic indices were applied to the automatic identification of chiral centers in more than 100,000 chiral compounds (each with 1-38 chiral centers), and both indices correctly identified all the chiral centers. The method of identifying chiral centers with the d-aEAID index has been incorporated into a chiral index program in our laboratory and used in studies on the structure-activity relationships of chiral compounds.

3. Fast prediction of HOMO/LUMO using machine learning methods

In addition, the ATID index was applied to identify duplicated structures in a molecular data set related to frontier orbital energies, yielding a data set of 111,725 molecules. Machine learning algorithms based on molecular descriptors were then used to predict the HOMO and LUMO energies. The results indicated that the mean absolute errors (MAE) of the random forest model reached 0.15 and 0.16 eV for the HOMO and LUMO, respectively. When orbital energies calculated by PM7 were combined with molecular descriptors to build random forest models, the results were clearly better than those calculated by PM7 alone (the MAE was reduced by more than 30%).
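The uniqueness test described above amounts to computing an index for every structure in a data set and checking whether two different structures ever receive the same value. The following is a minimal sketch of that check, not the thesis's implementation. Because the definition of ATID is not given here, the classical Wiener index (the sum of shortest-path distances over all atom pairs) is used purely as a stand-in. The names `graph_from_edges`, `wiener_index`, `find_degenerate_groups`, and `SKELETONS` are hypothetical, and the three heptane carbon skeletons are a toy data set chosen for illustration.

```python
from collections import defaultdict, deque


def graph_from_edges(edges):
    """Build an undirected adjacency map from a list of bonded atom pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return dict(adjacency)


def wiener_index(adjacency):
    """Stand-in index (NOT ATID): sum of shortest-path distances over atom pairs."""
    total = 0
    for source in adjacency:
        # Breadth-first search gives shortest path lengths in an unweighted graph.
        distance = {source: 0}
        queue = deque([source])
        while queue:
            node = queue.popleft()
            for neighbour in adjacency[node]:
                if neighbour not in distance:
                    distance[neighbour] = distance[node] + 1
                    queue.append(neighbour)
        total += sum(distance.values())
    return total // 2  # every pair was counted once from each end


def find_degenerate_groups(structures, index_fn):
    """Group structure names by index value; keep only values shared by 2+ structures."""
    groups = defaultdict(list)
    for name, graph in structures.items():
        groups[index_fn(graph)].append(name)
    return {value: names for value, names in groups.items() if len(names) > 1}


# Toy data set (assumption): hydrogen-suppressed carbon skeletons of three C7H16 isomers.
SKELETONS = {
    "n-heptane": graph_from_edges(
        [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7)]
    ),
    "2,4-dimethylpentane": graph_from_edges(
        [(1, 2), (2, 3), (3, 4), (4, 5), (2, 6), (4, 7)]
    ),
    "3-ethylpentane": graph_from_edges(
        [(1, 2), (2, 3), (3, 4), (4, 5), (3, 6), (6, 7)]
    ),
}

if __name__ == "__main__":
    print(find_degenerate_groups(SKELETONS, wiener_index))
```

On this toy set the stand-in index assigns the same value to 2,4-dimethylpentane and 3-ethylpentane, which is the kind of degeneracy a highly discriminating index must avoid. An empty result from `find_degenerate_groups` corresponds to the "no degeneracy" outcome reported for the virtual data sets. The same grouping step, applied to a real database, would surface candidate duplicate structures.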
Keywords/Search Tags: topological indices, equivalent chemical structures, automatic identification of chiral centers, HOMO and LUMO, uniqueness