Prediction Of Drug Biological Effect Attributes Based On Multi-source Information Integration
Posted on: 2024-01-29 | Degree: Doctor | Type: Dissertation | Country: China | Candidate: H C Zhao | GTID: 1524307310482294 | Subject: Computer application technology
Abstract/Summary:
With the rapid progress of computer science and technology, along with the vast accumulation of multi-source big data on medicines, the exploration of drug properties through artificial intelligence algorithms has become a prominent research area in computational biology and systems biology. Leveraging big data and AI algorithms in studying the biological effects of drugs enables the systematic capture of complex patterns that are difficult for humans to uncover, leading to improved efficiency and success rates in drug development. This paper focuses on mining drug-related biological effect attributes by integrating multiple sources of information. It proposes a series of deep learning frameworks to address two key issues: drug treatment property prediction and drug side effect prediction. These frameworks offer researchers new predictive results and optimize the technical approaches to drug development. The main research work and contributions are as follows:

(1) To address the problems that existing drug-Anatomical Therapeutic Chemical (ATC) classification code association prediction methods cannot simultaneously predict drug codes at all levels of the ATC classification system and lack the ability to learn features from sparsely known drug-ATC classification code associations, this paper proposes a prediction method based on a deep residual network (RNPred ATC) to predict the five-level ATC classification codes of approved drugs. The model integrates the semantic similarity of ATC classification codes, the structural similarity of drugs, and the known drug-ATC classification code associations to construct a drug-ATC classification code
interaction network for each level of ATC classification codes. By feeding the node and edge information of the network into the deep residual network, the model obtains prediction scores for the associations between drugs and ATC classification codes. Experimental results show that RNPred ATC achieves the best performance at each ATC classification level, with an average Area Under the receiver operating characteristic Curve (AUC) of 0.953. In addition, the visual analysis of hidden layers and the case studies illustrate the rationality and reliability of RNPred ATC.

(2) Because existing methods for predicting the potential ATC classification codes of compounds neither extract rich compound information nor consider the correlation between ATC classification codes, an end-to-end multi-label classifier (CGPred ATC) is proposed to predict the 14 main ATC classification codes of given compounds. CGPred ATC not only uses deep convolutional neural networks and residual blocks to represent and learn multiple association scores between a given compound and other compounds, but also uses a graph convolutional network to model and learn the associations between ATC classification codes. The compound representation learning process is guided by representations containing label correlation information. Experimental results indicate that CGPred ATC yields significant improvements in predictive performance over existing multi-label classifiers, with the absolute correct rate and error rate reaching 76.2% and 2.8%, respectively. In addition, case studies confirm the effectiveness of CGPred ATC in practical applications.

(3) To address the problem that existing drug side effect frequency prediction methods do not consider the prior data of drugs and side effects, resulting in limited model
generalization ability, a multi-view deep learning model (MGPred) is proposed to predict the frequencies of drug side effects. MGPred extracts view vectors from multiple data sources. For each view, MGPred aggregates heterogeneous neighborhood features via a node-level attention module to obtain latent representations of drug and side effect nodes. Experimental results show that the Root Mean Square Error (RMSE) of MGPred on the benchmark dataset is 0.651, a prediction performance about 50% better than that of matrix factorization-based methods. Furthermore, the visualization and ablation experiments on drug representations obtained from three different views demonstrate that MGPred is a powerful tool for predicting the frequencies of drug side effects.

(4) To address the problem that existing methods for predicting the frequencies of drug side effects rely on known frequencies and cannot predict the side effect frequencies of new drugs, a deep learning method based on multiple similarity fusion (SDPred) is proposed to determine drug-side effect associations and frequencies simultaneously. The model calculates multiple types of similarities from multi-source data and fuses them to reduce the noise of any single similarity measure and to extract effective features, thereby providing more accurate representations of drugs and side effects. To better learn the features of drug-side effect pairs, SDPred applies an interaction module based on outer product operations and convolutional neural networks. The experimental results show that the RMSE of MGPred on the benchmark dataset is 0.651, a prediction performance about 50% better than that of the matrix factorization-based method. The case analysis also confirms the validity of the method in practical applications.

(5) To address the problem that existing computational methods for drug side effect prediction cannot predict the serious outcomes that may be caused by drug side
effects, this paper uses seven classes to quantify the severity of drug side effects and proposes an end-to-end multitask deep learning framework (GCAP) for predicting the severity of drug side effects. GCAP has two tasks: one predicts whether a drug side effect is serious, and the other identifies the specific severity class of the side effect. Cross-validation and independent test set evaluations show that GCAP achieves satisfactory performance on both tasks. In addition, the de novo test results show that GCAP maintains good predictive performance even when only the structural features of drugs and the semantic features of side effects are used as input, which indicates that it can assist new drug discovery.

Keywords/Search Tags: drug biological effect properties, drug side effect prediction, drug repositioning, deep learning, information integration, data mining
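
The abstract reports results as RMSE (for side effect frequencies) and as an absolute correct rate and error rate (for multi-label ATC prediction). The sketch below shows how such metrics are commonly computed. It assumes that the "absolute correct rate" and "error rate" correspond to the "absolute true" and "absolute false" multi-label measures; the abstract does not define them, so this reading is an assumption. The function names and the sample data are hypothetical and are not taken from the dissertation.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    return math.sqrt(squared / len(y_true))


def absolute_true(true_sets, pred_sets):
    """Fraction of samples whose predicted label set matches exactly."""
    exact = sum(set(t) == set(p) for t, p in zip(true_sets, pred_sets))
    return exact / len(true_sets)


def absolute_false(true_sets, pred_sets, n_labels):
    """Average share of the n_labels that are wrongly assigned or missed."""
    wrong = sum(len(set(t) ^ set(p)) for t, p in zip(true_sets, pred_sets))
    return wrong / (len(true_sets) * n_labels)


# Made-up frequency classes (1 = very rare ... 5 = very frequent).
freq_true = [1, 3, 5, 2]
freq_pred = [1, 2, 5, 4]
print(round(rmse(freq_true, freq_pred), 3))  # sqrt(5 / 4), about 1.118

# Made-up first-level ATC label sets for three compounds (14 possible labels).
atc_true = [{"A"}, {"C", "N"}, {"J"}]
atc_pred = [{"A"}, {"C"}, {"L"}]
print(absolute_true(atc_true, atc_pred))        # 1 of 3 sets match exactly
print(absolute_false(atc_true, atc_pred, 14))   # 3 wrong labels / (3 * 14)
```

A lower RMSE and a lower absolute false value indicate better predictions, while a higher absolute true value is better, which is why the two kinds of figures in the abstract point in opposite directions.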
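
Contribution (4) mentions an interaction module built from outer product operations and convolutional neural networks. As a rough illustration of that idea only, the sketch below forms an interaction map from one drug embedding and one side effect embedding and passes it through a single convolution filter. The embedding values, the kernel, and the function names are invented for this example; the abstract does not give the dimensions, layers, or parameters of SDPred.

```python
def outer_product(u, v):
    """Interaction map whose entry (i, j) is u[i] * v[j]."""
    return [[a * b for b in v] for a in u]


def conv2d_valid(matrix, kernel):
    """Single-channel 2D cross-correlation, stride 1, no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(matrix) - kh + 1
    out_w = len(matrix[0]) - kw + 1
    return [
        [
            sum(
                matrix[i + di][j + dj] * kernel[di][dj]
                for di in range(kh)
                for dj in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Hypothetical 3-dimensional embeddings for one drug and one side effect.
drug_vec = [1.0, 2.0, 3.0]
side_vec = [0.5, -1.0, 2.0]

interaction_map = outer_product(drug_vec, side_vec)
# [[0.5, -1.0, 2.0], [1.0, -2.0, 4.0], [1.5, -3.0, 6.0]]

# Hypothetical 2x2 filter; a trained network would learn these weights.
kernel = [[1.0, 0.0], [0.0, 1.0]]
features = conv2d_valid(interaction_map, kernel)
# [[-1.5, 3.0], [-2.0, 4.0]]
print(features)
```

The outer product exposes every pairwise combination of embedding dimensions, so the convolution can pick up local patterns across dimension pairs rather than only the element-wise products that a dot product would use.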