
Research on Virus-Host Protein-Protein Interaction Prediction Based on Densely Connected Convolutional Networks

Posted on: 2024-01-29
Degree: Master
Type: Thesis
Country: China
Candidate: W J Wang
Full Text: PDF
GTID: 2530307085487334
Subject: Computer application technology
Abstract/Summary:
Viruses are highly diverse, and many are highly infectious and deadly to humans. For example, following the outbreak of the novel coronavirus in 2019, the World Health Organization (WHO) announced 6.5 million cumulative deaths from the disease worldwide. To protect humans from all types of viruses, the study of protein-protein interactions (PPIs) between viruses and hosts has become increasingly important. Traditional experimental PPI assays are time-consuming and labor-intensive, whereas computational prediction models can exclude protein pairs with low interaction probabilities, which complements experimental techniques and narrows the range of PPI candidates.

First, extracting protein sequence features with convolution alone captures only local key information and ignores long-range dependencies in protein sequences, which causes predictions to deviate from the true values. To address this problem, a densely connected convolutional network is constructed to extract local key features among amino acid residues, and a self-attention module is used to extract long-range dependency features among them. Adding iterative concatenation (splicing) operations between the convolutional layers can address problems such as gradient explosion and getting trapped in local optima. The proposed self-attention module receives two inputs: (I) the intermediate sequence features extracted by the preceding dense blocks; and (II) a position encoding that injects information about the absolute or relative positions of amino acid residues, which enables the extraction of long-range dependencies in the sequence.

Second, feature representations based only on the physicochemical properties of amino acids, or only on the protein sequence, are relatively simple and need to be supplemented by other representations. This work proposes classifying amino acids according to their hydrophobic and electrostatic interactions. The amino acid class information is one-hot encoded, and Word2Vec is used to map each amino acid to a dense vector; protein sequence features are then represented by these two amino acid vectors. In addition, protein evolutionary information (RPM-PSSM) is integrated: the RPM-PSSM features are passed through a multi-layer fully connected neural network for feature learning and dimension compression. Finally, the sequence features and the evolutionary features are combined by multimodal fusion and fed into the classifier to predict the probability of interaction between virus and host protein pairs. Fusing multimodal information both adds effective information and improves prediction accuracy.

To validate the predictive performance of the DENCnet model, comparison experiments, ablation experiments, and tests of generalization ability were conducted. The results show that the proposed model predicts the probability of protein-protein interaction between virus and host more accurately, and that it still maintains about 90% accuracy when faced with newly emerging viruses.
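
The class-based one-hot encoding described in the abstract can be illustrated with a minimal sketch. The abstract does not list the amino acid classes the thesis uses, so the seven-class grouping below (the grouping commonly used in conjoint-triad features) is an assumption, and the names `AA_CLASSES`, `CLASS_INDEX`, and `one_hot_classes` are hypothetical.

```python
# Assumed seven-class grouping of the 20 standard amino acids.
# The abstract does not give the actual classes; this is illustrative only.
AA_CLASSES = ["AGV", "ILFP", "YMTS", "HNQW", "RK", "DE", "C"]

# Map each residue letter to the index of its class.
CLASS_INDEX = {aa: i for i, group in enumerate(AA_CLASSES) for aa in group}


def one_hot_classes(sequence):
    """Return one one-hot row per residue, indexed by amino acid class.

    Residues outside the 20 standard amino acids (for example 'X')
    are encoded as an all-zero row.
    """
    rows = []
    for aa in sequence.upper():
        row = [0] * len(AA_CLASSES)
        idx = CLASS_INDEX.get(aa)
        if idx is not None:
            row[idx] = 1
        rows.append(row)
    return rows


# Encode a short toy peptide and print one row per residue.
for residue, row in zip("MKD", one_hot_classes("MKD")):
    print(residue, row)
```

Each protein of length L becomes an L x 7 matrix under this assumed grouping. In the pipeline the abstract describes, such a matrix would sit alongside the Word2Vec residue embeddings as the sequence-feature input.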
Keywords/Search Tags: interaction prediction, deep learning, self-attention mechanism, protein, virus