
Classification Based On The Sequence Of The Protein Folding Rate And Membrane Protein Function

Posted on: 2011-01-02, Degree: Doctor, Type: Dissertation
Country: China, Candidate: J Z Gao, Full Text: PDF
GTID: 1110330332472469, Subject: Bioinformatics
Abstract/Summary:
Proteins play an important role in biology, and the relationships among protein sequence, structure and function are central to protein bioinformatics. This dissertation studies the relationship between protein sequence and structure, and between sequence and function. The main results fall into four parts:
1. A novel algorithm for structure comparison is proposed. We use dynamic time warping to align dihedral angle series, which represent a protein's 3D structure. We show that the alignment score of two proteins from the PDB-SELECT database follows a generalized extreme value distribution with parameters μ=94.7697, σ=41.5837, ξ=0.1925. Using this distribution we can compute the p-value of an alignment. The alignment score is a good measure for classifying proteins. Compared with structure alignment algorithms such as CTSS, our algorithm performs better in efficiency and statistical significance.
2. Membrane transport proteins play an important role in living cells. Amino acid occurrence, together with modified Kyte-Doolittle hydrophobicity scales, Ponnuswamy hydrophobicity scales, mean polarity and solvation free energy, each transformed by the fast Fourier transform, is used as the input vector of a support vector machine. Our method discriminates three classes of transporters in the Transport Classification Database with a five-fold cross-validation accuracy of 72.1%, an increase of 4% over the previous literature. Our study suggests that the method achieves better accuracy and classifies three kinds of membrane transport proteins effectively.
3. This part proposes two models for predicting protein folding rate. The first model selects its features from 531 physicochemical properties in the AAindex database, protein length and local structural entropy. The model is designed for three kinetic types: two-state, multi-state and mixed-state proteins. The correlations between predicted and experimental folding rates for these kinetic types are 0.790, 0.829 and 0.778, respectively. Compared with other models, our method uses fewer features, is simpler to compute and has smaller mean absolute errors. The second model is a novel sequence-based predictor, PFR-AF, which uses solvent accessibility and residue flexibility predicted from the sequence. PFR-AF's predictions are characterized by high correlation (between 0.71 and 0.95, depending on the data set) and the lowest mean absolute errors (between 0.75 and 0.9) with respect to the experimental rates. Our models reveal that, for two-state chains, inclusion of solvent-exposed Ala may accelerate folding, while increased Ile content may reduce the folding speed. We also demonstrate that increased flexibility of coils facilitates faster folding and that proteins with a larger content of solvent-exposed strands may fold at a slower pace. Increased flexibility of the solvent-exposed residues is shown to prolong folding, which also holds, with lower correlation, for the flexibility of strands. Two case studies are included to demonstrate the proposed method.
4. The open reading frame (ORF) is the basis of gene identification and genome analysis. We give the definition of p0-MORF and an algorithm for computing it, and prove that the set of p0-MORFs exists and is unique. We also discuss the relationship between CDS and p0-MORF in the S. coelicolor A3(2) genome.
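As an illustration of part 1, the following is a minimal sketch of dynamic time warping over two dihedral angle series, followed by the cumulative distribution function of a generalized extreme value distribution. It is not the dissertation's implementation. The function names, the use of a single angle series in degrees, and the circular absolute difference as the local cost are all assumptions made for the example; the abstract does not specify them.

```python
import math


def angular_diff(a, b):
    """Smallest absolute difference between two angles in degrees (0 to 180)."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def dtw_distance(s, t):
    """Classic dynamic time warping cost between two angle series.

    Runs in O(len(s) * len(t)) time and keeps only two rows of the table.
    The local cost (angular_diff) is an assumption, not taken from the source.
    """
    n, m = len(s), len(t)
    inf = float("inf")
    prev = [inf] * (m + 1)
    prev[0] = 0.0  # empty prefix aligned with empty prefix costs nothing
    for i in range(1, n + 1):
        cur = [inf] * (m + 1)
        for j in range(1, m + 1):
            cost = angular_diff(s[i - 1], t[j - 1])
            # extend the cheapest of: insertion, deletion, match
            cur[j] = cost + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    return prev[m]


def gev_cdf(x, mu, sigma, xi):
    """CDF of the generalized extreme value distribution."""
    z = (x - mu) / sigma
    if xi == 0.0:
        return math.exp(-math.exp(-z))  # Gumbel limit
    t = 1.0 + xi * z
    if t <= 0.0:
        # x lies outside the support of the distribution
        return 0.0 if xi > 0 else 1.0
    return math.exp(-t ** (-1.0 / xi))
```

With the parameters reported above, `gev_cdf(score, 94.7697, 41.5837, 0.1925)` gives the lower-tail probability of a score and `1 - gev_cdf(...)` the upper tail. Which tail serves as the p-value depends on whether small or large scores indicate structural similarity, which the abstract does not state.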
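The feature construction in part 2 can be sketched in the same spirit. The example below maps a sequence to a hydropathy profile, takes the low-frequency power spectrum of that profile, and appends it to the amino acid composition. Several things here are assumptions: the standard Kyte-Doolittle values stand in for the modified scale used in the dissertation, a naive discrete Fourier transform stands in for the FFT (it yields the same coefficients, only more slowly), and the names `composition`, `power_spectrum`, `features` and the choice of eight coefficients are hypothetical. The support vector machine step is not shown.

```python
import cmath
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Standard Kyte-Doolittle hydropathy values (assumption: the dissertation
# uses a modified scale that is not reproduced here).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def composition(seq):
    """Fraction of each of the 20 standard amino acids, in AMINO_ACIDS order."""
    n = len(seq)
    return [seq.count(a) / n for a in AMINO_ACIDS]


def power_spectrum(values, n_coeffs):
    """Power of DFT coefficients k = 1..n_coeffs of a mean-centered profile."""
    n = len(values)
    mean = sum(values) / n
    centered = [v - mean for v in values]
    spectrum = []
    for k in range(1, n_coeffs + 1):
        coeff = sum(
            v * cmath.exp(-2j * math.pi * k * i / n)
            for i, v in enumerate(centered)
        )
        spectrum.append(abs(coeff) ** 2 / n)
    return spectrum


def features(seq, n_coeffs=8):
    """Fixed-length vector: 20 composition values plus n_coeffs spectral values."""
    profile = [KD[a] for a in seq]  # raises KeyError on non-standard residues
    return composition(seq) + power_spectrum(profile, n_coeffs)
```

Because the vector length is fixed at 20 + `n_coeffs` regardless of sequence length, sequences of different lengths can be passed to the same classifier. The other scales named in the abstract would be handled by repeating `power_spectrum` with a different lookup table.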
Keywords/Search Tags:protein structure comparison, functional discrimination of membrane proteins, protein folding rate, solvent accessible surface, p0-MORF algorithm in genome