The Study Of Protein Subcellular Localization Based On Convolutional-LSTM

Posted on: 2019-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: S S Xu
GTID: 2370330566998111
Subject: Computer Science and Technology
Abstract/Summary:
The organelles in a cell cooperate with one another to carry out the cell's life activities. Subcellular localization is a main factor in the functional annotation of proteins, the gene products, and knowledge of targeting signals makes it possible to design complex drugs. The study of protein subcellular location is therefore very important for research into the pathogenesis of some diseases and for the development of new drugs. Early work relied on biological experiments such as fluorescence labeling, electron microscopy and ultracentrifugation. However, these techniques have long turnaround times and high costs, so researchers have sought more efficient computational methods. With the rapid development of genomics and proteomics, the volume of bioinformatics data has grown quickly, and using machine learning to predict protein subcellular location has become a hot topic in recent years. The main approaches are based on support vector machines, the nearest neighbor rule and artificial neural networks, and existing methods have achieved good results.

This thesis aims to mine the hidden information in protein data more effectively by designing better biological features and machine learning models, and thereby to achieve better prediction results. Two models, a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network, are first used separately to extract the information contained in the amino acid sequence and predict the subcellular location. Then, combining the advantages of both, an integrated Convolutional-LSTM model is built. Specifically, features are extracted from the protein data by the convolutional neural network, and the combined features are passed to the LSTM network for representation learning, which yields the subcellular localization result. An experiment was then carried out to explore how different spatial positions within the protein affect the results. The experimental results were obtained from fragments of length 500 at both ends of the protein. Finally, three feature vectors were added to guide the Convolutional-LSTM model: amino acid composition information, protein state information, and the physicochemical properties of amino acids as used in the nearest neighbor method. With these, the prediction of protein subcellular localization was completed.

To verify the validity of the Convolutional-LSTM model, 10-fold cross-validation is used to compare it with other efficient algorithms. The experimental results show that the prediction accuracy of the Convolutional-LSTM method reaches 82%, 81.7% and 96.8% on the plant, fungus and animal data sets, respectively, indicating that the method is effective and efficient.
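The two sequence-level inputs mentioned above, fragments taken from both ends of the protein and the amino acid composition vector, can be illustrated with a minimal sketch. This is not the thesis's code. It assumes that "fragments of 500 at both ends" means 500 residues from each terminus, and the function names, the padding symbol `X` and the padding rule for short sequences are all hypothetical choices made for illustration.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues
PAD = "X"  # hypothetical padding symbol for sequences shorter than n


def terminal_fragments(seq, n=500):
    """Return (N-terminal, C-terminal) fragments, each exactly n long.

    Assumption: short sequences are padded with PAD on the side that
    faces away from the terminus being kept.
    """
    head = seq[:n].ljust(n, PAD)   # first n residues, padded on the right
    tail = seq[-n:].rjust(n, PAD)  # last n residues, padded on the left
    return head, tail


def composition(seq):
    """Fraction of each standard amino acid in seq, in AMINO_ACIDS order."""
    counts = Counter(c for c in seq if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * len(AMINO_ACIDS)
    return [counts[a] / total for a in AMINO_ACIDS]
```

In a pipeline of the kind the abstract describes, the two fragments would be encoded and fed to the convolutional layers, while the 20-value composition vector would be one of the auxiliary feature vectors. How the thesis actually encodes and combines them is not stated here.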
Keywords/Search Tags: protein subcellular location, convolutional neural network, long short-term memory neural network, classification