
Protein Secondary Structure Prediction

Posted on: 2021-02-08
Degree: Master
Type: Thesis
Country: China
Candidate: S P Zhu
Full Text: PDF
GTID: 2370330602997170
Subject: Software engineering
Abstract/Summary:
Protein secondary structure is the basis for studying protein folding and coiling. The folded and coiled state of a protein determines the activity of proteases in the human body, and the loss or reduction of this bioactivity can directly lead to disease. Determining protein secondary structure is therefore of great help in studying protein complexes in the human body and in preventing and treating disease. This thesis proposes a prediction method based on classification modeling by protein length. The main work of the thesis is as follows:

(1) Evaluation of online servers for protein secondary structure prediction. To better understand the current state of research in protein secondary structure prediction, 350 proteins were downloaded from the protein structure database and used to evaluate six classic domestic and international prediction servers: PSRSM, MUFOLD, SPIDER, RAPTORX, JPRED, and PSIPRED. The PSRSM server achieved the best Q3 and Sov accuracy.

(2) A prediction method based on optimized linear classifier modeling. This method uses 25PDB as the training set and CB513 as the test set. The regularization coefficient and the linear coefficient threshold of the linear classifier were tuned by manual optimization and by Bayesian optimization. After optimization, the experimental results were 0.1% higher than before.

(3) A classification modeling method based on protein length segmentation and a deep convolutional neural network. Astral and Cull PDB, two classic large datasets in protein structure prediction, were first combined into a single dataset, Astra Cull. The proteins in this dataset were then divided by length into four or six segments. For each segment, the convolution kernel size, the number of kernels, the number of network layers, the learning rate, and the regularization coefficient of the deep convolutional neural network were optimized separately to find the best network structure, yielding a 4-segment network model and a 6-segment network model. To improve accuracy further, additional protein feature information was incorporated, and an optimized 6-segment network model was proposed. The results show that the best accuracy of the 6-segment model is higher than that of the 4-segment model. The highest Q3 accuracy of the 6-segment model on the CASP9, CASP10, CASP11, CASP12, and CB513 datasets is 83.67%, 78.99%, 78.53%, 71.52%, and 85.94%, respectively, and the CB513 result is better than that of many classic prediction methods.

(4) A classification modeling method based on Bayesian optimization. This method divides the Astra Cull data into six groups by protein length and uses Bayesian optimization to tune four parameters of the convolutional neural network on each group: convolution depth, learning rate, regularization coefficient, and stochastic gradient descent momentum. The best Q3 accuracy of the resulting model on the CASP9, CASP10, CASP11, CASP12, and CB513 datasets is 80.08%, 77.74%, 77.06%, 69.95%, and 83.09%, respectively.

The experimental results indicate that the protein length classification modeling method proposed in this thesis is effective and that it accounts for the influence of both long-range and short-range information on structure prediction. It not only shortens training time but also lets each protein be predicted by the model trained on proteins of similar length, which improves prediction accuracy. The use of deep learning also improves accuracy and points to a direction for future work on protein secondary structure prediction.
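Two ideas from the abstract can be illustrated with a minimal Python sketch: routing a protein to one of six length segments, and scoring a prediction with Q3, the fraction of residues whose three-state label (helix H, strand E, coil C) is predicted correctly. The length boundaries and the function names below are assumptions made for illustration; the abstract does not state the cut points or the implementation used in the thesis.

```python
from bisect import bisect_right

# Hypothetical length boundaries in residues (NOT taken from the thesis).
# Five boundaries give six segments: [0,100), [100,200), ..., [500, inf).
SEGMENT_BOUNDS = [100, 200, 300, 400, 500]


def segment_index(length, bounds=SEGMENT_BOUNDS):
    """Return the index (0..len(bounds)) of the length segment for a protein."""
    if length < 0:
        raise ValueError("length must be non-negative")
    return bisect_right(bounds, length)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted 3-state label matches the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have equal length")
    if len(observed) == 0:
        raise ValueError("sequences must be non-empty")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


if __name__ == "__main__":
    # A 4-residue toy example: three of four labels agree.
    print(segment_index(4))              # segment 0 under the assumed bounds
    print(q3_accuracy("HHEC", "HHCC"))   # 0.75
```

In a setup like the one described, the segment index would select which per-length model makes the prediction, and Q3 would then be computed over all residues of the test set.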
Keywords/Search Tags: protein secondary structure, length information, deep convolution, Bayesian, multi-classification mode