
Molecular Graph Generation Based On Deep Learning

Posted on: 2022-09-02
Degree: Master
Type: Thesis
Country: China
Candidate: B Y Jiang
GTID: 2480306740496384
Subject: Signal and Information Processing
Abstract/Summary:
The exploration of new molecules has become a major topic in materials chemistry, and material innovation is a key driver of many recent technological advances. Research in chemistry and materials science is constantly evolving to develop compounds with novel uses, lower costs and better properties. This thesis focuses on molecular graph generation based on deep learning, studying the tasks of molecular generation and target property optimization. The main work is as follows.

Firstly, I study two types of molecular representation, text representation and graph representation, and compare their mechanisms, characteristics, advantages and disadvantages. The QM9 data set used in this thesis is introduced, and the evaluation metrics are explained from the perspectives of molecular generation and target property optimization.

Secondly, I study a typical molecular graph generation model, which can generate arbitrary graphs sequentially and capture their structure and properties. The experimental results show that, after training, the model can generate high-quality synthetic graphs and real molecules, either unconditionally or conditioned on data. This model generally performs better than baselines that do not use graph representations, showing great potential and unique advantages.

Then, I study Reinforcement Learning (RL) and the Generative Adversarial Network (GAN) and propose MGAN, a molecular graph generation model based on RL and GAN. This model uses a GAN based on the Wasserstein distance to operate directly on graph-structured data. Measuring the distance between the real and generated sample distributions with the Wasserstein distance instead of the Jensen-Shannon divergence gives a more stable GAN. This approach is combined with Reinforcement Learning objectives to encourage the generation of molecules with specific desired chemical properties. Combined with minibatch discrimination, the mode collapse problem of the GAN is alleviated and the stability of the model is greatly improved. In the molecular generation experiments, MGAN performs well on validity and novelty, reaching 99.8% and 93% respectively. Only the uniqueness rate is relatively low, at 19.2%. At the same time, MGAN can learn the distribution of the original data set: the molecules it generates largely match the QM9 distribution in terms of molecular weight and solubility, although the generated distribution tends to be more concentrated. In the molecular optimization task, the optimization performance improves by 4.2% while validity remains at 100%.

Finally, I study the Variational Autoencoder (VAE) and propose MVAE, a molecular graph generation model based on VAE. The original model builds a Gated Graph Neural Network (GGNN) into the VAE encoder and decoder, but it runs slowly and occupies a large amount of memory, whereas the Message Passing Neural Network (MPNN) performs excellently on molecular property prediction benchmarks, so I replace the original GGNN with an MPNN. The VAE latent space is structured to allow optimization of molecular properties. In the molecular generation experiments, the model generates 100% valid compounds, with high novelty and uniqueness rates of 98.1% and 98.6% respectively. In the molecular optimization task, the target property QED is optimized further than with the other baselines: the optimization performance improves by 5.8% while validity remains at 100%.
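The generation rates reported above (the share of generated molecules that are valid, unique and novel) can be made concrete with a minimal sketch. The abstract does not state the exact definitions used in the thesis, so the ones below are an assumption following common practice in the molecular generation literature: validity over all generated samples, uniqueness over valid samples, and novelty over unique valid samples not seen in training. The function name `generation_metrics` and the checker `toy_is_valid` are hypothetical; a real pipeline would use a chemistry toolkit to decide validity.

```python
from typing import Callable, Iterable, Optional


def generation_metrics(
    generated: Iterable[Optional[str]],
    training_set: Iterable[str],
    is_valid: Callable[[str], bool],
) -> dict:
    """Compute validity, uniqueness and novelty rates (assumed definitions)."""
    samples = list(generated)
    train = set(training_set)
    # Valid: samples that decoded to something and pass the validity check.
    valid = [m for m in samples if m is not None and is_valid(m)]
    # Unique: distinct molecules among the valid ones.
    unique = set(valid)
    # Novel: unique valid molecules that do not appear in the training set.
    novel = unique - train
    return {
        "validity": len(valid) / len(samples) if samples else 0.0,
        "uniqueness": len(unique) / len(valid) if valid else 0.0,
        "novelty": len(novel) / len(unique) if unique else 0.0,
    }


def toy_is_valid(molecule: str) -> bool:
    """Stand-in validity check (assumption): parentheses must balance."""
    return molecule.count("(") == molecule.count(")")


if __name__ == "__main__":
    generated = ["CCO", "CCO", "CCN", "C(", None]  # toy strings, not real output
    metrics = generation_metrics(generated, {"CCO"}, toy_is_valid)
    print(metrics)  # validity 0.6, uniqueness 2/3, novelty 0.5
```

Under these definitions, a model can score high on validity and novelty while scoring low on uniqueness if it repeatedly emits the same few valid molecules, which is the pattern that mode collapse in a GAN would produce.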
Keywords/Search Tags:Deep Learning, Molecular Generation, Reinforcement Learning, Generative Adversarial Network, Variational Autoencoder