
Title: Prediction of Protein Secondary Structure Based on Hidden Markov Model

Posted on: 2018-10-06
Degree: Master
Type: Thesis
Country: China
Candidate: W P Li
GTID: 2370330518458906
Subject: Control Engineering
Abstract/Summary:
The Human Genome Project, launched near the end of the last century, produced a complete map of the human genome at the beginning of this century. The period since then is known as the post-genome era, and proteomics has become one of the most important research topics in the life sciences. The first task in studying the proteome is to study the structure and function of its proteins. It is widely accepted that secondary structure prediction is the first step toward predicting a protein's overall structure. The principle is to classify each part of the amino acid sequence by its local structure, which yields the predicted secondary structure. A highly accurate secondary structure prediction method therefore plays an important role in subsequent research.

Protein secondary structure prediction has been studied since the middle of the last century, and many effective methods have been proposed. This thesis uses a Hidden Markov Model (HMM) for prediction. An HMM describes a Markov process with hidden, unknown parameters: its states cannot be observed directly, but they can be inferred by analyzing the sequence of observation vectors.

The experiments use a typical protein data set, CB513. After preprocessing to remove sequences that are not representative of general proteins, 492 protein sequences remain. Of these, 420 are randomly selected as experimental sequences and randomly assigned to 10 experimental groups of 42 sequences each. Each group is randomly divided into seven equal parts of 6 sequences each. To improve accuracy, 7-fold cross-validation is used: six parts serve as the training set and the remaining part as the test set, with each part taking its turn as the test set. This gives seven experiments per group and 70 experiments across the 10 groups. The HMM uses single-residue probabilities as its main parameters. The final overall accuracy exceeds 58%. The method can still be improved, and later work may further optimize the algorithm to raise the accuracy.
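The partitioning scheme described in the abstract (420 sequences, 10 groups of 42, seven parts of 6 per group) can be sketched as follows. This is a minimal illustration in Python, not the thesis's own code: the function names `partition` and `cross_validation_runs`, the fixed random seed, and the use of integer IDs in place of real CB513 sequences are all assumptions made for the example.

```python
import random


def partition(sequence_ids, n_groups=10, n_folds=7, seed=0):
    """Shuffle the IDs, split them into equal groups, then split each group into equal folds."""
    ids = list(sequence_ids)
    if len(ids) % (n_groups * n_folds) != 0:
        raise ValueError("sequence count must divide evenly into groups and folds")
    # Seeded shuffle so the (hypothetical) split is reproducible.
    random.Random(seed).shuffle(ids)
    group_size = len(ids) // n_groups
    fold_size = group_size // n_folds
    groups = []
    for g in range(n_groups):
        group = ids[g * group_size:(g + 1) * group_size]
        folds = [group[f * fold_size:(f + 1) * fold_size] for f in range(n_folds)]
        groups.append(folds)
    return groups


def cross_validation_runs(groups):
    """Yield (group_index, fold_index, train_ids, test_ids) for every run."""
    for g, folds in enumerate(groups):
        for k, test_ids in enumerate(folds):
            # All folds of the same group except fold k form the training set.
            train_ids = [s for j, fold in enumerate(folds) if j != k for s in fold]
            yield g, k, train_ids, test_ids


# Integer placeholders stand in for the 420 selected protein sequences.
groups = partition(range(420))
runs = list(cross_validation_runs(groups))
```

With these sizes, each run trains on 36 sequences and tests on 6, and 10 groups times 7 folds gives the 70 experiments mentioned in the abstract.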
Keywords/Search Tags: Protein secondary structure prediction, Hidden Markov Model (HMM), CB513 dataset, Groups of protein sequences, 7-fold cross-validation