Research On Molecular Property Prediction And Generation Technology Based On Machine Learning

Posted on:2024-01-30

Degree:Master

Type:Thesis

Country:China

Candidate:L J Li

Full Text:PDF

GTID:2544307079459484

Subject:Computer Science and Technology

Abstract/Summary:

Molecular property prediction and molecular generation are key components of computer-aided drug discovery and development; they can speed up drug development and reduce research costs. At present, most high-performance molecular property prediction and molecular generation models are based on machine learning. However, existing methods still face several challenges.

In molecular property prediction, most of the best-performing models are built with deep learning. Such models rely on large amounts of labeled data, yet labeling molecular properties accurately is time-consuming and expensive. To improve the performance of molecular property prediction models under a limited annotation budget, this thesis proposes a Pre-trained Variational Adversarial Active Learning method (PREVAIL) to select the molecules to be labeled. Unlike previous active learning methods, which build the initial set by random sampling, PREVAIL selects the most informative initial dataset with deep clustering, thus avoiding biases that affect the accuracy of the early decision process. In addition, PREVAIL uses task-aware variational adversarial active learning to merge the loss information from the molecular property prediction task into the latent space, so that the latent space adapts to both the distribution of molecules and the information from the prediction task.

In molecular generation, most existing models are also based on deep learning, because molecules have complex structures and properties and deep learning methods can extract complex features efficiently. However, these methods face problems with generation validity and with the semantic information of labels. This thesis therefore proposes Cross Adversarial Learning for Molecular Generation (CRAG), which combines the realism of variational auto-encoder based methods with the diversity of generative adversarial network based methods to further exploit the complex properties of molecules. Specifically, CRAG uses an adversarially regularized encoder-decoder to transform molecules from the simplified molecular-input line-entry system (SMILES) into discrete variables. The discrete variables are then trained to predict properties and to generate adversarial samples through projected gradient descent.

For conditional molecular generation, this thesis proposes a conditional generation model based on cross adversarial learning, Cross Adversarial Learning for Conditional Molecular Generation (CCRAG). To generate and optimize molecules with targeted properties, CCRAG extends CRAG with a predictor module that computes mutual information to separate the latent vectors of molecules from the property information. Both CRAG and CCRAG are trained by adversarial learning.

PREVAIL, CRAG, and CCRAG have been evaluated extensively on the QM9 and ZINC datasets. The experimental results demonstrate the advanced performance and effectiveness of the proposed models. The models presented in this thesis are therefore expected to support molecular design based on artificial intelligence in various chemical applications and to promote progress in drug discovery, materials science, and related fields.
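
The abstract states that CRAG generates adversarial samples through projected gradient descent but gives no further detail. The following is a minimal sketch of that general technique only, not the thesis implementation. It assumes an L-infinity constraint around the original latent vector and a linear stand-in for the property predictor; the function names and all constants are hypothetical.

```python
# Illustrative sketch of projected gradient descent (PGD) on a latent vector.
# Assumptions (not from the thesis): linear predictor, squared-error loss,
# L-infinity ball of radius eps, and every name and constant below.

def predict(w, b, z):
    """Linear stand-in for a property predictor acting on latent vector z."""
    return sum(wi * zi for wi, zi in zip(w, z)) + b


def loss_grad(w, b, z, y):
    """Gradient of the squared error (predict(z) - y)**2 with respect to z."""
    residual = predict(w, b, z) - y
    return [2.0 * residual * wi for wi in w]


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def pgd_attack(w, b, z0, y, eps=0.1, alpha=0.02, steps=10):
    """Increase the loss while staying within eps of z0 in every coordinate."""
    z = list(z0)
    for _ in range(steps):
        g = loss_grad(w, b, z, y)
        # Ascent step along the sign of the gradient.
        z = [zi + alpha * sign(gi) for zi, gi in zip(z, g)]
        # Projection: clip each coordinate back into [z0 - eps, z0 + eps].
        z = [min(max(zi, z0i - eps), z0i + eps) for zi, z0i in zip(z, z0)]
    return z


if __name__ == "__main__":
    w, b = [1.0, -2.0], 0.0
    z0, y = [0.5, 0.5], 0.0
    z_adv = pgd_attack(w, b, z0, y)
    print("original loss:   ", (predict(w, b, z0) - y) ** 2)
    print("adversarial loss:", (predict(w, b, z_adv) - y) ** 2)
```

In this toy setting the perturbed vector stays within `eps` of `z0` while the prediction error grows. In a real model the gradient would come from automatic differentiation through the trained predictor rather than from a closed-form expression.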

Keywords/Search Tags:

Molecular Property Prediction, Molecular Generation, Deep Clustering, Active Learning, Adversarial Learning

Related items

1	Molecular Property Prediction Based On Deep Learning And Multi-Dimensional Encoding Molecular Information
2	Research On Key Technologies For Drug Molecule Recognition And Property Prediction Based On Deep Learning
3	Research On A New Method Of Generating Potential Anti-HIV Active Molecules Based On Deep Learning
4	Research On Molecular Property Prediction Methods Based On BERT
5	Multi-view Molecular Property Prediction Based On Language Models
6	Research On Molecular Property Prediction Model Based On Pseudo-twin Network
7	Stroke Risk Prediction Based On Hybrid Deep Transfer Learning And Domain Adaptation
8	Research On Molecular Properties Prediction And Molecular Generation Based On Molecular 3D Representation For Drug
9	Research On Drug Combination Property Prediction Methods Based On Deep Learning
10	Research On Nosocomial Infectious Prediction And Unbalanced Classification Based On Active Learning And Generative Adversarial Networks