
Deep Learning-based Prediction Of DNA-binding Proteins

Posted on: 2022-01-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y Han
Full Text: PDF
GTID: 2510306755451274
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
DNA-binding proteins are a class of proteins that bind to DNA and form complexes in organisms, and they play an important role in heredity and evolution. Studying and predicting DNA-binding proteins is therefore of great importance to medicine, agriculture, and animal husbandry. With the advent of the post-genome era, a large number of sequence-based analysis and prediction methods have been proposed, which has greatly advanced the field of DNA-binding protein prediction. Working from sequence information, this thesis uses machine learning algorithms and deep learning frameworks to construct DNA-binding protein prediction models. The details are as follows:

(1) We summarize the current state of research on DNA-binding protein prediction. We first introduce the relevant protein databases, then introduce some general feature extraction methods used for DNA-binding proteins and briefly describe the extraction process. Finally, we present some popular algorithms and their principles.

(2) Using traditional machine learning algorithms, we propose a DNA-binding protein prediction method based on pseudo-evolution information. Starting from the evolutionary information of the protein sequence, we extract the position-specific scoring matrix (PSSM) of the sequence, divide the matrix into sub-matrices, extract pseudo-local evolution features, and combine them with the amino acid composition of the protein to form its pseudo-evolution information. We select features using a radial basis function support vector machine together with a bias reduction algorithm, compare different classifiers, and build the final prediction model. On the test set, this method achieves a prediction accuracy of 91.2% and a Matthews correlation coefficient of 0.826, which are better than those of existing machine learning methods.

(3) Using a deep learning framework, we propose a DNA-binding protein prediction method based on BLSTM. We use an embedding to convert amino acid sequences into vectors. The network, which consists of convolutional layers, pooling layers, and a bidirectional long short-term memory (BLSTM) network, is trained on a self-built data set and then evaluated on the test set. It achieves an accuracy of 92.0% and a Matthews correlation coefficient of 0.841, which are better than the results we reproduced with other methods on the same data set. In addition, we constructed a data set of DNA-binding proteins from different species using UniProt and explored how well the model predicts across species.

Both DNA-binding protein prediction methods proposed in this thesis show good performance, and they can be further applied to protein analysis and prediction.
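The feature construction described in part (2) can be illustrated with a small sketch. The abstract does not say how the position-specific scoring matrix is divided or how the pseudo-local evolution features are computed, so the scheme below is an assumption: the matrix rows are split into contiguous segments, each of the 20 columns is averaged within a segment, and the amino acid composition is appended. The function names and the `n_segments` parameter are hypothetical and are not taken from the thesis.

```python
# Hypothetical sketch: segment-averaged PSSM features plus amino acid composition.
# The segmentation rule and all names are illustrative assumptions, not the
# thesis's actual definition of "pseudo-evolution information".

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def amino_acid_composition(sequence):
    """Return the fraction of each standard amino acid in the sequence."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    length = len(sequence)
    return [sequence.count(aa) / length for aa in AMINO_ACIDS]


def segment_pssm_features(pssm, n_segments):
    """Split the PSSM rows into contiguous segments and average each column.

    pssm is a list of rows, one per residue, each holding 20 scores.
    Returns n_segments * 20 values.
    """
    length = len(pssm)
    if n_segments < 1 or length < n_segments:
        raise ValueError("need at least one PSSM row per segment")
    features = []
    for s in range(n_segments):
        start = s * length // n_segments
        end = (s + 1) * length // n_segments
        block = pssm[start:end]  # sub-matrix for this segment
        for col in range(20):
            features.append(sum(row[col] for row in block) / len(block))
    return features


def pseudo_evolution_vector(sequence, pssm, n_segments=3):
    """Concatenate segment-level PSSM averages with amino acid composition."""
    if len(sequence) != len(pssm):
        raise ValueError("sequence and PSSM must have the same length")
    return segment_pssm_features(pssm, n_segments) + amino_acid_composition(sequence)


if __name__ == "__main__":
    # Toy input: a 4-residue sequence with a made-up PSSM (row i holds the value i).
    toy_sequence = "ACDA"
    toy_pssm = [[float(i)] * 20 for i in range(4)]
    vector = pseudo_evolution_vector(toy_sequence, toy_pssm, n_segments=2)
    print(len(vector))
```

With 2 segments the toy input yields 2 × 20 segment averages plus 20 composition values. A real pipeline would obtain the PSSM from a tool such as PSI-BLAST and would pass the resulting vectors to feature selection and a classifier, as the abstract describes.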
Keywords/Search Tags: DNA-binding protein, feature extraction, pseudo-evolutionary information, BLSTM, deep learning