
The Study Of Computational Drug Discovery Based On Deep Learning

Posted on: 2024-05-27
Degree: Doctor
Type: Dissertation
Country: China
Candidate: L Zhang
GTID: 1524307319491904
Subject: Control Science and Engineering
Abstract/Summary:
Drug discovery is the process of researching, developing, and introducing new therapeutic drugs or vaccines to address human diseases and health problems. In recent years, advances in deep learning and its practical application to drug discovery have significantly improved the efficiency and success rate of drug discovery and gradually reduced its time and cost. The deep learning-based paradigm has become an important complement to traditional drug discovery methods. This dissertation studies three significant problems in target identification and the exploration of drug-like compounds: the prediction of non-coding miRNA biomarkers, of drug-target binding affinity, and of PROTAC-targeted degradation capability. The prediction of non-coding miRNA biomarkers is one of the critical problems in target identification. The prediction of drug-target binding affinity and of PROTAC-targeted degradation capability are two equally important sub-branches of the exploration of drug-like compounds, which lies downstream of target identification.

For non-coding miRNA biomarker prediction, this dissertation proposes VAEMDA, a prediction model built on an unsupervised deep learning framework, the variational autoencoder. VAEMDA addresses two problems in this field: some models cannot predict potentially relevant miRNAs for new diseases, and reliable negative samples are lacking. To evaluate VAEMDA, this dissertation conducted global leave-one-out cross-validation (LOOCV), local LOOCV, five-fold cross-validation, and case studies. The results show that VAEMDA offers high prediction accuracy and robustness. The strengths of VAEMDA include: 1) VAEMDA does not require negative samples; 2) VAEMDA’s core is a generative model, which has advantages in handling missing data; 3) VAEMDA’s latent dimensions can capture complex relationships between diseases and miRNAs; 4) the variational autoencoder's reconstruction of the original data continually reduces noise, while the computation of the Kullback-Leibler loss continually introduces Gaussian noise, and this adversarial relationship helps avoid overfitting.

For drug-target binding affinity prediction, this dissertation presents two prediction models: MRBDTA, based on the multi-head attention mechanism and a skip connection strategy, and GPCNDTA, based on edge-weighted graph encoders and cross-attention networks.

To further improve the accuracy, robustness, and generalization of models, and to address the lack of biological interpretability and application cases in deep learning models, this dissertation developed MRBDTA from embedding layers, positional encoding, transformer encoders, a skip connection strategy, and feed-forward networks. To evaluate MRBDTA, this dissertation conducted testing experiments, module analysis experiments, biological interpretability analysis, and case studies. The results show that MRBDTA has high prediction accuracy and strong stability. Module analysis experiments demonstrate the effectiveness of the Trans module and the skip connection strategy. Biological interpretability analysis shows that MRBDTA can accurately capture some interaction sites between proteins and drugs. Case studies demonstrate the application potential of MRBDTA in the field of anti-COVID drugs. The innovations of MRBDTA include: 1) a novel molecular sequence encoder, the Trans module; 2) the introduction of a skip connection strategy at the encoder level; 3) the ability to capture interaction sites between drugs and targets.

To address the problems of extracting edge information from molecular graphs, obtaining knowledge about pharmacophores, integrating multimodal data of the same biological molecule, and effectively capturing interactions between two different biological molecules, this dissertation developed GPCNDTA. To evaluate GPCNDTA, this dissertation conducted testing experiments, ablation experiments, and case studies. The test results show that GPCNDTA outperforms state-of-the-art models in prediction performance and reliability. A case study based on 3C-like proteases demonstrates the application potential of GPCNDTA, and a pharmacophore-based case study further reflects its reliability. The innovations of GPCNDTA include: 1) a novel graph encoder that can fully extract edge information from molecular graphs; 2) pharmacophores as novel prior knowledge to improve prediction accuracy; 3) the design of intra- and inter-molecular cross-attention.

For the prediction of PROTAC-targeted degradation capability, this dissertation introduces Ai PROTACs, a prediction model based on graph contrastive learning. Existing machine learning methods and general-purpose neural networks perform unreliably in this area, specialized models for the domain are lacking, and labeled training data is scarce. Given these issues, this dissertation developed Ai PROTACs on a self-built dataset using a graph contrastive learning mechanism, graph augmentation strategies, novel graph encoders, and modal interaction strategies. Ai PROTACs consists of three parts: feature encoding, graph contrastive learning, and supervised learning. To evaluate Ai PROTACs, this dissertation conducted testing experiments and case studies. The test results show that Ai PROTACs outperforms baseline methods in various aspects. Case studies reflect the high sensitivity and reliability of Ai PROTACs. The contributions of this study include: 1) preprocessing data from PROTACs-DB 2.0 to obtain samples that can be used directly for model training; 2) constructing a novel dataset for model evaluation; 3) introducing contrastive learning to leverage unlabeled samples; 4) designing novel graph encoders and modal interaction strategies.
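
As an illustration of the two loss terms mentioned for VAEMDA (reconstruction of the original data and the Kullback-Leibler loss), the following is a minimal sketch of a generic variational autoencoder objective in plain Python. It is not the dissertation's implementation: the Bernoulli (binary cross-entropy) reconstruction term, the weighting factor `beta`, and all function names are assumptions made for illustration, on the premise that a disease's miRNA associations can be written as a 0/1 vector.

```python
import math

def kl_to_standard_normal(mu, log_var):
    """KL divergence from N(mu, diag(exp(log_var))) to N(0, I), summed over latent dims."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))

def bernoulli_reconstruction_loss(x, x_hat, eps=1e-7):
    """Binary cross-entropy between a 0/1 association vector x and its reconstruction x_hat.

    Assumption: associations are modelled as independent Bernoulli variables.
    """
    total = 0.0
    for xi, pi in zip(x, x_hat):
        pi = min(max(pi, eps), 1.0 - eps)  # clamp to keep log() finite
        total -= xi * math.log(pi) + (1.0 - xi) * math.log(1.0 - pi)
    return total

def vae_loss(x, x_hat, mu, log_var, beta=1.0):
    """Reconstruction term plus beta-weighted KL term (beta is a hypothetical knob)."""
    return bernoulli_reconstruction_loss(x, x_hat) + beta * kl_to_standard_normal(mu, log_var)

# Hypothetical example: one disease, four candidate miRNAs, two latent dimensions.
x = [1, 0, 0, 1]              # known associations (1) and unobserved pairs (0)
x_hat = [0.9, 0.2, 0.1, 0.7]  # decoder output, read as association scores
mu, log_var = [0.3, -0.1], [0.0, -0.2]
print(vae_loss(x, x_hat, mu, log_var))
```

In this sketch the reconstruction term pulls the decoder output toward the observed associations, while the KL term pulls the latent distribution toward a standard Gaussian; the balance between the two is what the abstract describes as guarding against overfitting.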
Keywords/Search Tags: miRNA, small molecule drug, protein target, PROTAC, deep learning, prediction model