Study On The Models Of Protein Structure Prediction

Posted on: 2009-04-11
Degree: Doctor
Type: Dissertation
Country: China
Candidate: O Y Shi
GTID: 1100360245984393
Subject: Biomedical engineering
Abstract/Summary:
Objective

The purpose of this study is to improve the accuracy and efficiency of protein structure prediction in two ways:

1. Starting from the 3-state hidden Markov model (HMM) for protein secondary structure prediction, propose optimized 7-state and 15-state HMMs, then build hybrid models that combine a back-propagation (BP) neural network (BPNN) with the sequence-profile-based optimized HMMs to improve the accuracy of protein secondary structure prediction.
2. Use protein secondary structure together with evolutionary information to predict disulfide connectivity, which is expected to greatly reduce the conformational search space in protein-folding prediction and so improve prediction efficiency.

Methods

The optimized HMMs were evaluated on a dataset of 492 proteins (82,272 amino acid residues) filtered from the CB513 dataset collected by Cuff and Barton. The proteins were randomly divided into 7 subsets with similar secondary structure content in all subsets: under DSSP assignment, about 35% of residues in α-helix, 23% in β-strand and 42% in coil. The 7-state HMM, the 15-state HMM and the hybrid BPNN-HMM models were applied to predict protein secondary structure, and their prediction accuracy was evaluated by 7-fold cross validation. Finally, the prediction results were analyzed and compared.

Disulfide connectivity prediction was evaluated on a dataset of 252 protein sequences selected from the SWISS-PROT database, each protein having at least two and at most five intra-chain disulfide bonds. First, the bias in the secondary structure preference of free cysteines and cystines was analyzed, and then a BPNN was developed. The inputs of the neural network were the residues symmetrically flanking both cysteines of a potential disulfide bond, along with the secondary structure of those residues and PSI-BLAST-determined evolutionary information (PSSM). Finally, prediction accuracy was evaluated by 4-fold cross validation and the prediction results were analyzed.
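The input encoding described for the disulfide connectivity network can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the function names are hypothetical, the window length of 15 is taken from the architecture reported in the results, and a one-hot residue encoding stands in for the PSSM columns that PSI-BLAST would supply.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # 20 standard residues
SS_STATES = "HEC"                      # helix, strand, coil
PER_POSITION = len(AMINO_ACIDS) + len(SS_STATES)


def one_hot(symbol, alphabet):
    """Return a one-hot list for `symbol`; all zeros if it is not in `alphabet`."""
    return [1.0 if symbol == a else 0.0 for a in alphabet]


def window_features(seq, ss, center, window=15):
    """Encode a symmetric window of residues around `center`.

    Each position contributes a one-hot residue vector (assumed stand-in
    for a PSSM column) plus a one-hot secondary structure vector.
    Positions that fall outside the sequence are zero-padded.
    """
    half = window // 2
    feats = []
    for pos in range(center - half, center + half + 1):
        if 0 <= pos < len(seq):
            feats += one_hot(seq[pos], AMINO_ACIDS) + one_hot(ss[pos], SS_STATES)
        else:
            feats += [0.0] * PER_POSITION
    return feats


def pair_features(seq, ss, i, j, window=15):
    """Concatenate the windows around both cysteines of a candidate bond."""
    return window_features(seq, ss, i, window) + window_features(seq, ss, j, window)


def candidate_pairs(seq):
    """List every pair of cysteine positions as a potential disulfide bond."""
    cys = [k for k, aa in enumerate(seq) if aa == "C"]
    return [(a, b) for n, a in enumerate(cys) for b in cys[n + 1:]]
```

Each candidate pair yields one fixed-length vector (2 x 15 x 23 values under these assumptions), which is the kind of input a feed-forward network with a single hidden layer can score as bonded or not bonded.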
Results

Compared with the 3-state HMM, the 7-state HMM increases Q3 and SOV by 3.11% and 6.15%, and improves QE by 6.49%. The Q3 and SOV of the 15-state HMM are 0.18% and 1.8% better than those of the 7-state HMM, and QE is improved by 5.74%. The Q3 and SOV of the 15-state HMM combined with evolutionary information are 8.36% and 8.2% better than those of the single-sequence 15-state HMM, and QH, QE and QC are improved by 10.8%, 15.8% and 3.9% respectively.

Compared with a two-layer BPNN, the hybrid models improve Q3 and SOV by 1.11% and 1.69%, and improve QH, QE and QC by 1.3%, 1.02% and 4.6% respectively.

In disulfide connectivity prediction, under the same BPNN architecture (window length = 15, number of hidden units = 50), adding secondary structure to the PSSM encoding increases Sn, Sp, Mcc, Qc and Qp by 3.06%, 0.69%, 0.041, 1.1% and 2.78% respectively, compared with the PSSM-only encoding.

Conclusions

1. The 7-state HMM predicts protein secondary structure better than the 3-state HMM. The 15-state HMM performs similarly to the 7-state HMM overall but benefits β-strand prediction. Adding evolutionary information to the 15-state HMM improves prediction performance further.
2. The hybrid BPNN-HMM models achieve higher protein secondary structure prediction accuracy than the two-layer BPNN model.
3. Combining protein secondary structure with PSSMs in a BPNN to predict disulfide connectivity is feasible and effective.
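Several of the accuracy measures quoted in the results can be sketched in a few lines of Python. The abstract does not define them, so the definitions below are assumptions based on common usage: Q3 as the fraction of residues whose three-state label is predicted correctly, QH/QE/QC as the same fraction restricted to residues observed in one state, and Mcc as the Matthews correlation coefficient. SOV, Sn, Sp, Qc and Qp are omitted, and all function names are hypothetical.

```python
import math


def q3(observed, predicted):
    """Fraction of residues whose three-state label (H, E, C) is correct."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("sequences must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return correct / len(observed)


def per_state_q(observed, predicted, state):
    """Fraction of residues observed in `state` that are predicted as `state`.

    With state 'H', 'E' or 'C' this gives QH, QE or QC (assumed definition).
    Returns NaN when the state never occurs in `observed`.
    """
    total = sum(o == state for o in observed)
    if total == 0:
        return float("nan")
    correct = sum(o == state and p == state for o, p in zip(observed, predicted))
    return correct / total


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from binary confusion counts.

    Returns 0.0 when any marginal count is zero (a common convention).
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0
```

For example, with observed labels "HHHEECCC" and predicted labels "HHCEECCH", six of eight residues match, so `q3` returns 0.75, while `per_state_q` for state "E" returns 1.0.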
Keywords/Search Tags: protein secondary structure prediction, 7-state HMM, 15-state HMM, BP neural network, disulfide bond prediction