A Data-driven Research On Molecular Generation

Posted on: 2022-02-11
Degree: Master
Type: Thesis
Country: China
Candidate: J F Zhu
Full Text: PDF
GTID: 2481306323978669
Subject: Computer application technology
Abstract/Summary:
Generating molecules with desired properties is an important task in chemistry and pharmacy. Historically, many important compounds were discovered by accident. As technology has developed, researchers have begun to explore methods for generating molecules with desired chemical properties. However, the traditional molecular generation paradigm requires a great deal of expert knowledge. This thesis takes a data-driven approach to molecular generation in order to alleviate this problem. By studying the characteristics of molecular space, suitable algorithms can be designed to overcome the difficulties of molecular generation. The specific work and innovations are as follows.

First, this thesis proposes a genetic algorithm suited to exploring the space of large molecules. Because molecular space is so large, traditional traversal methods cannot cover it. Molecular space is also not smooth, so small structural changes can cause large fluctuations in property values. Overcoming these difficulties is an important task in molecular generation. Drawing on prior knowledge of chemistry, this thesis designs a widely adaptable optimization algorithm based on the characteristics of molecular space, and evaluates it in a large number of quantitative and qualitative experiments. A z-test on the experimental results indicates, with a probability of more than 99%, that the results of the proposed algorithm are significant. In the course of these experiments, this thesis also identifies a defect in the evaluation standards used in previous work, a defect that may affect the objective evaluation of work across this research direction.

Second, this thesis extends the genetic algorithm above. The proposed method can explore molecular space fully and effectively without being limited by a dataset. However, the generated molecules often have extremely complex structures, which reduces their chemical stability, a disadvantage that molecules in the dataset often do not have. The algorithm is also prone to falling into local optima. To balance exploration of molecular space against molecular stability, and at the same time to increase the diversity of the solutions found, this thesis combines the genetic algorithm with neural networks. This combination makes the method more useful in practice.

Finally, this thesis summarizes the research work and outlines directions for future development.
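
As a rough illustration of the kind of genetic search loop the abstract refers to (not the thesis's actual algorithm, which is not described here), the sketch below evolves bitstrings as a stand-in for molecules. The names `evolve` and `count_ones`, the bitstring encoding, the toy fitness function, and all parameter values are assumptions made purely for illustration.

```python
import random


def count_ones(bits):
    """Toy fitness (assumption): number of 1s; a real system would score a molecular property."""
    return sum(bits)


def evolve(fitness, genome_len=20, pop_size=30, generations=40,
           mutation_rate=0.05, elite=2, seed=0):
    """Minimal genetic algorithm over bitstrings with elitism, crossover, and mutation."""
    rng = random.Random(seed)
    # Random initial population.
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    history = []  # best fitness seen at the start of each generation
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        # Elitism: copy the best individuals unchanged into the next generation.
        next_pop = [ind[:] for ind in pop[:elite]]
        # Truncation selection: only the top half may reproduce.
        parents = pop[:pop_size // 2]
        while len(next_pop) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, genome_len)  # one-point crossover
            child = a[:cut] + b[cut:]
            # Flip each bit independently with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness)
    return best, history + [fitness(best)]


best, history = evolve(count_ones)
print(best, history[0], history[-1])
```

Because the elite individuals are carried over unchanged, the best fitness never decreases from one generation to the next. In a molecular setting, the bitstring would be replaced by a molecular representation and the crossover and mutation steps by chemically valid edits, which is where the prior chemical knowledge mentioned above would enter.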
Keywords/Search Tags: data mining, molecular generation, genetic algorithm, drug discovery, artificial intelligence