Prediction of β-Hairpins, β(γ)-Turns and Four Kinds of Simple Super-secondary Structures in Proteins

Posted on: 2008-06-04
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Z Hu
GTID: 1100360245487032
Subject: Theoretical Physics
Abstract/Summary:
Knowledge of a protein's structure is important for understanding its function. With the success of the Human Genome Project, a widening gap has appeared between the rapidly increasing number of known protein sequences and the slow accumulation of known protein structures. Determining protein structure purely by experimental approaches is time-consuming and expensive, so theoretical and computational methods for predicting protein structure have become increasingly important. At present, directly predicting the three-dimensional (3D) structure of a protein from its sequence is a difficult task. However, local structural motifs carry strong sequence signals, occur commonly in 3D structures, and govern the stability and fold of proteins. Predicting local structure may therefore help to simplify the structure prediction problem, and it is a key step toward predicting 3D structure. In this dissertation, we investigated the prediction of super-secondary structures in proteins, especially β-hairpin motifs. In addition, β-turns and γ-turns, two kinds of secondary structure in proteins, were also studied.

1. Based on the algorithm of the least increment of diversity, a new algorithm that combines the increment of diversity with a support vector machine (ID_SVM) is proposed to predict the β-hairpins in the ArchDB40 dataset, and better results are obtained.

2. The sequence information is expressed as a composite vector of increment of diversity and scoring values, and this vector is used as the input to a support vector machine (SVM), which finds the optimal hyperplane in the vector space to separate β-hairpins from non-β-hairpins. On this basis, a new algorithm that combines the increment of diversity and scoring values with a support vector machine (ID_PCSF_SVM) is proposed for predicting β-hairpin motifs in the ArchDB40 dataset and the EVA dataset (Kumar and Bhasin, Nucleic Acids Research, 2005, 33:154-159, http://cubic.bioc.columbia.edu/eva/index.html). Higher predictive success rates than those of the previous algorithms are obtained: the overall prediction accuracy is improved by 4%, and the sensitivity for β-hairpins is increased by 6%. We also applied our method to predict the super-secondary structures of the ArchDB40 dataset, and better results are obtained both in 5-fold cross-validation on the training set and on the independent testing set.

3. The increment of diversity, the scoring values and predicted secondary structure information are selected together as the input parameters of the SVM. A new algorithm is proposed for predicting β-turns in a set of 426 proteins and γ-turns in a set of 320 proteins. For the β-turns, the overall prediction accuracy and Matthews correlation coefficient (Mcc) in 7-fold cross-validation are 79.8% and 0.47, respectively. For the γ-turns, the Mcc in 5-fold cross-validation is 0.18.

4. A database is constructed that contains 2208 protein chains with resolution better than 2.5 Å and sequence identity lower than 40%. These chains contain 6799 α-α, 6711 α-β, 6072 β-α and 8163 β-β motifs. Based on the diversity increment algorithm, the four types of super-secondary structure are predicted. For the "822type" fixed-length pattern of 8 amino acids, the average prediction accuracy is 78% in the 3-fold cross-validation test and 76.7% in the jack-knife test. For the "1041type" fixed-length pattern of 10 amino acids, the prediction accuracies are 83% and 79.8%, respectively.

5. By using the information of dipeptide composition and the amino acid hydropathy distribution, the predictive results for super-secondary structures, β-hairpins, β-turns and γ-turns are improved.
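The abstract names the increment of diversity and the Matthews correlation coefficient but does not state their formulas. The sketch below is only an illustration, assuming the definitions commonly used in the increment-of-diversity literature: D(X) = N ln N - Σ n_i ln n_i for a count vector X with total N, and ID(X, Y) = D(X + Y) - D(X) - D(Y). The function names, the three-component count vectors and the class labels are hypothetical, and the SVM stage of ID_SVM and ID_PCSF_SVM is not reproduced here.

```python
import math


def diversity(counts):
    """Diversity measure D(X) = N*ln(N) - sum(n_i*ln(n_i)); 0*ln(0) is taken as 0."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return total * math.log(total) - sum(n * math.log(n) for n in counts if n > 0)


def increment_of_diversity(x, y):
    """ID(X, Y) = D(X + Y) - D(X) - D(Y) for two count vectors of equal length."""
    if len(x) != len(y):
        raise ValueError("count vectors must have the same length")
    merged = [a + b for a, b in zip(x, y)]
    return diversity(merged) - diversity(x) - diversity(y)


def least_id_class(query, class_counts):
    """Assign the query to the class whose count vector gives the smallest ID."""
    return min(class_counts, key=lambda label: increment_of_diversity(query, class_counts[label]))


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy (hypothetical) counts over three residue categories, for illustration only.
classes = {"hairpin": [40, 2, 8], "non-hairpin": [5, 30, 15]}
print(least_id_class([4, 0, 1], classes))
print(mcc(tp=5, tn=5, fp=0, fn=0))
```

A smaller increment of diversity means the query composition is more similar to the class composition, which is why the least-ID rule picks the minimum. In the combined algorithms described in the abstract, such ID values serve as SVM input features rather than as the final decision.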
Keywords/Search Tags:Local structural prediction, Super-secondary structure motif, β-Hairpin, β-Turn, γ-Turn, Increment of diversity, Scoring matrix, Support vector machine