Research And Application Of Protein Classification Based On Deep Learning

Posted on:2019-07-05

Degree:Master

Type:Thesis

Country:China

Candidate:L F Shao

Full Text:PDF

GTID:2310330563953936

Subject:Computer software and theory

Abstract/Summary:

Antioxidant proteins can repair DNA damage in humans and therefore play an important role in cancer treatment, so classifying protein sequences is crucial for oxidation prediction in pharmacology. Since the launch of the Human Genome Project, protein classification has become an important branch of proteomics. Biological data grow exponentially every year, so identifying protein sequences through biochemical experiments is very time-consuming. Developing new bioinformatics algorithms offers an efficient and reliable alternative, and using such algorithms to study protein classification and to predict protein structure and function has considerable practical significance. As a technology built on databases, statistics, and AI, data mining gives biologists an unprecedented data analysis tool and a powerful means of analyzing and extracting protein information. This thesis applies deep learning, as a data mining method, to protein sequence classification. The main contents are as follows:

1. A feature extraction method based on the primary structure of proteins is introduced. A protein sequence contains enough information to predict the biological, physical, and chemical properties of the protein molecule, and the features extracted from it bound the best performance that subsequent classifiers can reach. This thesis describes protein sequence information with the dipeptide composition, which is widely used in biology. This feature extraction method needs no other information and is simple and fast to compute. It plays a decisive role in the performance of subsequent classifiers.

2. A protein sequence classification model based on deep learning is proposed. Unlike traditional machine learning methods, which rely on hand-engineered feature extractors, deep learning is a feature learning method: simple but nonlinear transformations turn the raw data into abstract representations suited to classification. The first part of the model is a feature learning network composed of an encoder and a fully connected network, which learns a compressed, abstract representation from the original feature vector. The t-SNE method then reduces the learned features to a two-dimensional space, and finally the data are fed into an SVM classifier to identify protein sequences. Experiments show that the model identifies antioxidant proteins well: on the experimental data, the F1 score is 0.8842, the MCC is 0.7409, the accuracy is 97.05%, and the recall is 81.27%, which is superior to traditional machine learning methods.

3. Based on the proposed model, an online web service for antioxidant protein identification is developed. The service predicts online whether an amino acid sequence submitted by a user is an antioxidant protein. It also provides the data set used in this thesis for download, for the convenience of users and researchers.
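
The dipeptide composition feature described in item 1 of the abstract can be illustrated with a minimal Python sketch. The abstract gives no formula, so this follows the common definition of g-gap dipeptide composition (the term used in the keywords): the frequency of each of the 400 ordered residue pairs separated by g positions. The function name, the default g=0, the choice to skip pairs containing non-standard residues, and normalizing by the number of counted pairs are all assumptions made for illustration, not details taken from the thesis.

```python
from itertools import product

# The 20 standard amino acids, in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

def g_gap_dipeptide_composition(sequence, g=0):
    """Return a 400-dimensional frequency vector for one protein sequence.

    Hypothetical helper: each entry is the fraction of residue pairs
    (seq[i], seq[i + g + 1]) that equal a given ordered pair.
    g=0 gives the ordinary (adjacent) dipeptide composition.
    """
    pairs = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
    index = {pair: i for i, pair in enumerate(pairs)}
    counts = [0] * len(pairs)

    seq = sequence.upper()
    total = 0
    for i in range(len(seq) - g - 1):
        pair = seq[i] + seq[i + g + 1]
        # Assumption: pairs with non-standard residues (e.g. X) are skipped.
        if pair in index:
            counts[index[pair]] += 1
            total += 1

    if total == 0:
        # Sequence too short (or no valid pairs): return an all-zero vector.
        return [0.0] * len(pairs)
    return [c / total for c in counts]

features = g_gap_dipeptide_composition("MKTAYIAKQR", g=0)
print(len(features))
```

The resulting fixed-length vector depends only on the sequence itself, which matches the abstract's point that the feature needs no other information and is cheap to compute. In the pipeline the abstract describes, a vector of this kind would be the input to the feature learning network.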

Keywords/Search Tags:

antioxidant protein classification, g-gap dipeptide composition, deep learning, autoencoder

Related items

1	Network Representation Learning Based On Improved Louvain Algorithm And Deep Autoencoder
2	Antioxidant Protein Identification Based On Support Vector Machine
3	Research On Single-cell RNA-seq Data Mining Based On Deep Learning
4	The Application Of Feature Extraction And Classification Algorithm In Predict Membrane Protein Classification Problem
5	Research On Protein Sequence Classification Based On Deep Learning
6	Research On Protein Ubiquitin Classification Algorithm Using Deep Learning
7	Characterization Of Shale Gas Layer Reflection Seismic Signals Based On Sparse Autoencoder
8	Anomaly Detection In Attributed Networks Based On Deep Autoencoder
9	Research On Intrusion Prevention Technology Based On Three-Way Decisions And Deep Learning
10	Research On Protein Complexes Detection Based On Deep Autoencoder