| Human understanding of eukaryotic gene expression remains limited. Although many protein variations are statistically associated with human diseases, the exact mechanism of almost all of these mutations is unclear. Proteins are the main executors of life activities: from genetic material to the regulation of gene expression, from cell signaling to cell metabolism, and from cell growth and reproduction to the death of organisms, proteins play a very important role. Although some proteins function mainly as monomers, a large proportion bind to other ligand molecules or take part in cellular activities as biological complexes. Many key biological functions and processes are largely controlled by different types of protein-protein interactions (PPIs), and the dysregulation of these interactions is linked to many diseases. Predicting PPIs is therefore crucial for understanding biological processes, and various experimental methods exist for identifying new PPIs. Although previous approaches continue to improve, they share a common shortcoming: the model is built on features extracted from the observed data using strong domain knowledge. Bias in that domain knowledge, together with possible noise, may lead the downstream classifier to learn incorrect patterns. Deep learning is an increasingly popular alternative. It stacks multiple layers of abstraction to map observations into a high-level abstract space and builds predictive models in that space. This approach offers many attractive solutions for integrating heterogeneous data and for automatically learning complex patterns from multiple simple raw inputs. In this thesis, we propose a method called DeepPPI that uses deep neural networks to efficiently extract features from common protein descriptors (such as amino acid composition, dipeptide composition, CTD features, QSOD descriptors, and APAAC composition) to learn a representation
of each protein, so that the interaction between two proteins can be predicted more accurately. The network architecture automatically extracts features from the sequence-derived features of the proteins and learns the implicit rules within them; by fusing the original input features of the two proteins, the framework can better learn high-level features. Experimental results show that DeepPPI performs well on the test dataset, achieving an accuracy of 92.50%, a precision of 94.38%, a recall of 90.56%, a specificity of 94.49%, a Matthews correlation coefficient of 85.08%, and an area under the curve of 97.43%. Extensive experiments show that DeepPPI learns useful properties of protein pairs through hierarchical feature extraction. Compared with existing methods, DeepPPI offers clear advantages. |
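
For readers unfamiliar with the evaluation indices reported above, the following is a minimal sketch of how accuracy, precision, recall, specificity, and the Matthews correlation coefficient are conventionally computed from the counts of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and chosen only for illustration; they are not taken from the DeepPPI experiments and do not reproduce the reported figures. The area under the curve is omitted because it requires ranked prediction scores rather than counts.

```python
import math


def _safe_div(numerator, denominator):
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(tp, fp, tn, fn):
    """Compute common binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: numbers of true positives, false positives,
    true negatives, and false negatives (hypothetical inputs).
    """
    total = tp + fp + tn + fn
    # Denominator of the Matthews correlation coefficient.
    mcc_den = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return {
        "accuracy": _safe_div(tp + tn, total),
        "precision": _safe_div(tp, tp + fp),
        "recall": _safe_div(tp, tp + fn),        # also called sensitivity
        "specificity": _safe_div(tn, tn + fp),
        "mcc": _safe_div(tp * tn - fp * fn, mcc_den),
    }


if __name__ == "__main__":
    # Illustrative counts only; not results from the thesis.
    for name, value in binary_metrics(tp=40, fp=10, tn=45, fn=5).items():
        print(f"{name}: {value:.4f}")
```

Note that the Matthews correlation coefficient ranges from -1 to 1; expressing it as a percentage, as the abstract does, corresponds to multiplying the value by 100.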