Research On Ensemble Learning Method For Drug-Target Interaction Prediction

Posted on: 2021-03-24  Degree: Master  Type: Thesis
Country: China  Candidate: J Yang  Full Text: PDF
GTID: 2504306017499564  Subject: Software engineering
Abstract/Summary:
Identifying drug-target interactions (DTIs) is an important problem in drug development. Traditional DTI discovery methods based on biological experiments are time-consuming and labor-intensive, so computational methods are needed to speed up screening and reduce cost. In this thesis, drug-target interaction prediction is formulated as a binary classification problem. Ensemble learning offers strong robustness, high prediction performance, and a better ability to handle high-dimensional and imbalanced data sets. Drug-target interaction prediction is therefore studied with ensemble learning from the following three aspects:

1. Downsampling is used to construct a class-balanced data set. The random subspace method and dimension reduction are then combined to map the drug-target data into multiple different hidden spaces, which are used to train multiple diverse neural networks. Finally, these neural networks are combined for prediction.

2. A stacking framework is proposed to address class imbalance in drug-target data. It first uses sampling without replacement to obtain multiple mutually disjoint negative sample sets, then combines the positive sample set with each negative sample set to train multiple weak learners, and finally uses a meta-learner to combine the outputs of these weak learners into the final prediction.

3. For incrementally arriving drug-target data, a continual learning method for the ensemble model is studied. A single model learns from continuously generated data sets, acquiring new knowledge while retaining what it has already learned, so that the classification model can be expanded continuously.

The proposed methods are validated using gene expression profiles from multiple cell lines. The experimental results show that the proposed ensemble neural network model effectively reduces the risk of overfitting and improves classification performance; the proposed stacking framework alleviates class imbalance in drug-target data and further improves classification performance; and the proposed continual learning method allows one model to learn multiple data sets effectively. These results contribute positively to drug development.
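The negative-sampling step of the stacking framework can be illustrated with a minimal Python sketch. It is not the thesis's implementation: the function name `balanced_training_sets`, the choice of making each negative subset the same size as the positive set, and the toy sample names are all assumptions made for illustration. Only the data-splitting step is shown; training the weak learners and the meta-learner is left out.

```python
import random


def balanced_training_sets(positives, negatives, seed=0):
    """Pair the full positive set with disjoint subsets of the negatives.

    Hypothetical helper: the subset size (equal to the number of
    positives) is an assumption, not a detail given in the abstract.
    """
    if not positives:
        raise ValueError("need at least one positive sample")
    rng = random.Random(seed)
    shuffled = list(negatives)
    # Shuffling once and slicing is sampling without replacement:
    # no negative sample can appear in two subsets.
    rng.shuffle(shuffled)
    k = len(positives)
    n_sets = len(shuffled) // k  # leftover negatives (fewer than k) are dropped
    training_sets = []
    for i in range(n_sets):
        chunk = shuffled[i * k:(i + 1) * k]
        # Label 1 marks an interacting pair, label 0 a non-interacting one.
        samples = [(x, 1) for x in positives] + [(x, 0) for x in chunk]
        training_sets.append(samples)
    return training_sets


# Toy data: 3 positive and 10 negative drug-target pairs (made-up names).
positives = ["p0", "p1", "p2"]
negatives = [f"n{i}" for i in range(10)]
sets = balanced_training_sets(positives, negatives)
```

Each element of `sets` would train one weak learner, and a meta-learner would then be fitted on the weak learners' outputs. Because every subset is class-balanced, no single weak learner sees the original imbalance, while the ensemble as a whole still uses most of the negative samples.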
Keywords/Search Tags:drug-target interaction prediction, ensemble learning, continual learning, classification method