Font Size: a A A

The Research Of Deep Neural Network Model For Predicting Protein Interactions Using Only Sequence Information

Posted on:2019-08-27Degree:MasterType:Thesis
Country:ChinaCandidate:H LiFull Text:PDF
GTID:2370330626452403Subject:Computer technology
Abstract/Summary:PDF Full Text Request
Machine learning based predictions of protein-protein interaction(PPI)could provide valuable help for protein function,disease development,and large-scale treatment design.But most machine learning-based methods require a lot of feature engineering,which increases the workload and time cost of predictive tasks.Emerging deep learning technologies can automate feature engineering and have achieved great success in many areas.However,in most cases,over-fitting and generalization performance of deep learning models have not been well studied.In this paper,we present a deep neural network framework(DNN-PPI)that predicts PPI using features that are automatically learned only from protein primary sequences.Within the framework,the sequences of two interacting proteins are sequentially fed into the encoding,embedding,convolution neural network,and long short-term memory neural network layers.Then,a concatenated vector of the two outputs from the previous layer is wired as the input of the fully connected neural network.Finally,the classification is done by the sigmoid function to predict whether the protein pairs have interactions.The method proposed in this paper can not only detect the positional relationship between amino acid residues,but also detect the long-term and short-term dependence between amino acids.At the same time,compared with the traditional machine learning method,the deep learning model can automatically extract more abstract features,which is also beneficial to obtain better prediction results.Finally,when the method was applied to Pan's human protein interaction dataset,the prediction accuracy was 0.988 and the Matthews correlation coefficient was 0.976.Accuracy ranges from 0.928 to 0.979 when predicting six external data sets.These experimental results show that the method proposed in this study has obvious advantages in predicting protein interactions.
Keywords/Search Tags:Convolution neural networks, Long short-term memory neural networks, Protein-protein interaction, Model generalization
PDF Full Text Request
Related items