
Study On Some Key Algorithms In Protein Structural Class Prediction

Posted on: 2015-03-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Li
GTID: 2250330428463240
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
The Human Genome Project has spurred rapid development of sequencing techniques, and protein sequence data are growing quickly as a result. However, our understanding of protein spatial structure and function lags far behind the volume of available sequences. Moreover, traditional experimental methods cannot keep up with the demand for analyzing such large amounts of sequence data. Developing computational prediction models, one of the most important and basic subjects in bioinformatics, therefore plays an important role in the study of protein spatial structure and function. Machine learning offers a newer way to predict the structure and function of unknown proteins from a large number of known proteins, and it complements traditional experimental methods. Protein structural class reflects the overall distribution of a protein's secondary structures and plays an important role in determining its higher-order structure and function. The prediction of protein structural class is thus a basis for predicting protein spatial structure and function, which in turn influences proteomics research. This study takes protein structural class prediction as its research object, and its main contents are summarized as follows:

For protein feature extraction, we proposed an algorithm that extracts protein evolutionary information from the position-specific scoring matrix (PSSM), taking into account the multi-interval correlation between two lines of the matrix. Following the definition of protein structural class, we designed a position distribution function and gave a method for extracting the position distribution information of secondary structure elements. We also proposed an approach for extracting protein structural pattern information through correlation analysis between protein structural domains and secondary structure.

For feature selection, we designed a wrapper based on feature ranking and a Support Vector Machine (SVM) to select a feature subset, eliminate redundant information, and retain the core features. For the predicted secondary structure information and the evolutionary information, we selected the optimal feature subset of each before fusing them. The effectiveness of the proposed method was tested in a range of experiments and compared with existing prediction methods.

For the prediction algorithm, we first compared the performance of the k-nearest neighbor algorithm and the support vector machine in protein structural class prediction. Using two voting methods, we then proposed several multi-class classifiers built from binary support vector machines to improve the accuracy of protein structural class prediction.
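
As an illustration of how binary classifiers can be combined by voting into a multi-class predictor, the sketch below uses one-vs-one majority voting over the four classical structural classes. The abstract does not name the two voting methods used in the thesis, so majority voting is an assumption here, and `toy_pair_classifier` is a hypothetical stand-in for a trained binary SVM, not the thesis's model. The feature vector `(helix_fraction, strand_fraction)` is likewise invented for the example.

```python
from collections import Counter
from itertools import combinations

# The four classical protein structural classes.
CLASSES = ["all-alpha", "all-beta", "alpha/beta", "alpha+beta"]


def one_vs_one_vote(binary_classifiers, x):
    """Combine pairwise binary decisions by majority vote.

    binary_classifiers maps a class pair (a, b) to a callable that
    returns either a or b for the feature vector x.
    """
    votes = Counter(clf(x) for clf in binary_classifiers.values())
    # Ties are broken by the order of CLASSES (an arbitrary choice).
    return max(CLASSES, key=lambda c: votes[c])


def toy_pair_classifier(a, b):
    """Hypothetical stand-in for a binary SVM trained on classes a and b."""
    def score(label, x):
        helix, strand = x  # assumed features: secondary structure fractions
        return {
            "all-alpha": helix - strand,
            "all-beta": strand - helix,
            "alpha/beta": min(helix, strand),
            "alpha+beta": min(helix, strand) - 0.05,
        }[label]
    return lambda x: a if score(a, x) >= score(b, x) else b


# One binary classifier per unordered class pair: 4 classes give 6 pairs.
classifiers = {
    pair: toy_pair_classifier(*pair) for pair in combinations(CLASSES, 2)
}

print(one_vs_one_vote(classifiers, (0.6, 0.05)))
```

In a real setting each entry of `classifiers` would wrap a trained binary SVM, and a second scheme (for example, weighting each vote by the classifier's decision value) could replace the plain count in `one_vs_one_vote`.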
Keywords/Search Tags: prediction of protein structural class, feature extraction algorithm, information fusion, Support Vector Machine