The Research On Drug Repositioning Method Based On Multi-source Data Fusion

Posted on: 2024-09-26
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z X Liu
GTID: 1524307145970369
Subject: Bioinformatics
Abstract/Summary:
Developing a novel drug is costly, often takes more than a decade, and has a low success rate. High costs have pushed novel drugs beyond the reach of many people and have forced drug companies to focus on diseases with high market returns while ignoring diseases with low returns, such as orphan diseases. Reducing the time and money spent on drug development is therefore of great significance for human health. Computational drug repositioning methods can identify potential drug-target interactions, drug-disease associations, and other information important for drug research and development quickly and at high throughput, and they are attracting increasing attention from researchers. In this work, we integrate heterogeneous data from different sources using network embedding, and we explore how to model network features to obtain accurate vectors representing drugs, targets and diseases, so as to improve the accuracy of drug repositioning. We study several problems in the field of drug repositioning and propose four prediction methods. The main content and innovations are as follows:

(1) A method for drug-target interaction prediction based on a graph autoencoder. Multi-layer graph neural networks are prone to over-smoothing, and the number of known drug-target interactions (DTIs) is insufficient. To address these limitations, we propose a drug-target interaction prediction method named GADTI, which combines a graph autoencoder with random walk with restart. The encoder consists of a graph convolutional network followed by a random walk with restart, and it produces the node embeddings. The graph convolutional network aggregates the first-order neighborhood information of each node, and the random walk with restart diffuses the influence of each node across the entire heterogeneous network. The decoder reconstructs the original network from the node vectors and predicts new DTIs. GADTI can incorporate global features into the node representations while avoiding the over-smoothing and computational complexity caused by multi-layer convolutional networks. GADTI achieved better performance than the baseline methods, and the ablation experiments and case study confirmed the effectiveness of the proposed method.

(2) A drug-target interaction prediction method combined with self-supervised learning. Existing DTI prediction methods cannot fully mine network features, and DTI datasets are class-imbalanced. In response, we propose the SSLDTI method, which uses self-supervised learning to model the structural and topological features hidden in the network. SSLDTI first calculates the mutual influence matrix between nodes by random walk with restart. In the pretext task, the mutual influence between nodes serves as self-supervised information, so that the learned node embedding vectors fully incorporate the local and global features of the multi-source heterogeneous network. We also add a positive-sample loss compensation coefficient to the objective function, which amplifies the loss of misclassified positive samples according to the ratio of positive to negative samples and thereby reduces the impact of sample imbalance. Experiments show that SSLDTI outperforms state-of-the-art methods. Among the top 100 new DTIs predicted by SSLDTI, 98 were verified by the latest version of DrugBank.

(3) A drug-target binding affinity prediction method based on self-supervised pretraining and multi-level information fusion. Existing works usually apply convolutional neural networks to extract features from drug structures and target sequences. Limited by the size of the convolution kernel, a convolutional neural network has difficulty capturing dependencies at multiple levels, and it has difficulty capturing molecular topology and local context information at the same time. To this end, we propose SMI-DTA. The key idea is to use graph neural networks and pretraining models for self-supervised pretraining on large-scale unlabeled drug and protein datasets, which captures dependencies over different distances. SMI-DTA learns the representation vectors of drugs and targets at both the sequence level and the graph level to obtain models that accurately encode drugs and proteins. The graph form of a target is its three-dimensional structure predicted by AlphaFold. Next, SMI-DTA applies the pre-trained models to binding affinity prediction through transfer learning, so that accurate representations of drugs and targets can be obtained even when data samples are insufficient. Finally, the known drug-target binding affinities serve as self-supervised information to predict the binding affinities of the unknown drug-target pairs. Experiments show that SMI-DTA has higher accuracy than the baseline methods.

(4) A multi-view integrated drug-disease association prediction method. Existing drug-disease association prediction methods usually work on a single heterogeneous network and capture insufficient association information. In response, we propose a multi-view integrated drug-disease association prediction method, MVDDA. It first learns from the string representations of drugs and from multiple association networks of drugs and diseases through self-supervised pretraining. Each drug obtains four representation vectors and each disease obtains three. Next, a Transformer integrates the multiple drug vectors into a single drug vector and the multiple disease vectors into a single disease vector. Finally, the two vectors are concatenated and a neural network performs the drug-disease association prediction. Compared with previous studies such as deepDR, the prediction accuracy is significantly improved.

The four methods are closely related. Methods (1) and (2) study the same problem from different perspectives. Method (3) upgrades the research object from ‘whether a drug-target pair interacts’ to ‘how strongly a drug-target pair binds’, with the goal of predicting the binding affinity of a drug-target pair. Method (4) directly predicts potential drug-disease associations and provides useful information for drug repositioning from another perspective.
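
As an illustration of the random walk with restart that both GADTI and SSLDTI rely on, the following is a minimal NumPy sketch of the standard iterative formulation. It is not the dissertation's implementation: the function name `random_walk_with_restart`, the restart probability of 0.5, the column normalization, and the toy three-node graph are all assumptions chosen for the example.

```python
import numpy as np


def random_walk_with_restart(adj, restart_prob=0.5, tol=1e-8, max_iter=1000):
    """Return a matrix P whose column j holds the steady-state visiting
    probabilities of a walker that restarts at node j.

    Hypothetical helper for illustration only; the parameter names and
    defaults are assumptions, not values taken from the dissertation.
    """
    adj = np.asarray(adj, dtype=float)
    n = adj.shape[0]

    # Column-normalize the adjacency matrix so each column is a
    # transition distribution. Isolated nodes keep an all-zero column.
    col_sums = adj.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0.0] = 1.0
    transition = adj / col_sums

    restart = np.eye(n)   # one restart distribution per seed node
    probs = restart.copy()
    for _ in range(max_iter):
        # Standard RWR update: follow an edge with probability
        # (1 - restart_prob), jump back to the seed otherwise.
        updated = (1.0 - restart_prob) * transition @ probs + restart_prob * restart
        if np.abs(updated - probs).max() < tol:
            return updated
        probs = updated
    return probs


# Toy undirected path graph 0 - 1 - 2 (an assumed example network).
toy_adj = np.array([[0, 1, 0],
                    [1, 0, 1],
                    [0, 1, 0]])
influence = random_walk_with_restart(toy_adj)
print(influence)
```

Each column of the returned matrix can be read as the influence of one seed node on every other node, which is the kind of global, diffusion-based signal the abstract describes. In a heterogeneous drug-target-disease network the adjacency matrix would be assembled from several data sources, a step this sketch leaves out.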
Keywords/Search Tags:Drug Repositioning, Drug-target Interaction Prediction, Drug-target Binding Affinity Prediction, Drug-disease Association Prediction, Network Embedding, Multi-source Data, Heterogeneous Network