Studies On Proline In Peptide Chain By Bioinformatics And Molecular Dynamics

Posted on: 2006-10-19
Degree: Doctor
Type: Dissertation
Country: China
Candidate: M L Wang
Full Text: PDF
GTID: 1100360152475226
Subject: Fermentation engineering

Abstract/Summary:
Proline is very important to protein structure and function. Proline possesses a special ring structure that restricts the conformation of the peptide chain and gives biological macromolecules special characteristics. The change in proline conformation is the rate-limiting step in the folding and unfolding of many proteins and strengthens protein interactions that are based on conformational recognition.

Exploring the relationship between proline conformation and the primary structure of proteins, and constructing an applied model, can provide theoretical guidelines for protein structure prediction and X-ray diffraction data analysis, help us understand the influence of proline on proteins, yield general information about the relationship between proline and protein structure, and provide theoretical guidelines for protein engineering. Molecular dynamics simulation can reveal more about the influence of proline on the formation of protein structure.

In this dissertation, systematic studies of proline in proteins are therefore carried out by bioinformatics. A large prolyl peptide bond dataset was built from a list of structures with resolution better than 0.25 nm and with sequence identity between each pair of sequences below 30%. The list contained 2401 chains, with 1221 cis prolyl peptide bonds and 26401 trans prolyl peptide bonds. Several characteristics of the dataset were analyzed, including the distribution of residues at the N-terminal side of the peptide bonds, the statistics of the dihedral angles, the distribution of cis peptide bonds across secondary structures, and the proportion of cis peptide bonds.

Neural network and linear simulation methods were used to examine how the local amino acid sequence and the amino acid composition of the peptide chain affect cis-proline formation. Neural networks with the best architectures, based on local sequence and on amino acid composition, and a linear equation were constructed. Tests of the networks indicate that cis-proline formation has a distinct relationship with both local sequence and amino acid composition.

A new method for predicting peptidyl prolyl cis/trans conformation based on the theory of support vector machines (SVM) was introduced. The SVM is a newer approach to supervised pattern classification and has been applied successfully to a wide range of pattern recognition problems. In this study, six training datasets were used, each built from local sequences of a different length, and polynomial kernel functions with different values of the parameter d were tested. Both a test on an independent testing dataset and a jackknife test were carried out. With a local sequence length of 20 residues and d = 8, the SVM method achieved its best performance: the correct rates for the cis and trans forms reached 70.4% and 69.7% on the independent testing dataset, and 76.7% and 76.6% in the jackknife test, respectively. The Matthews correlation coefficient for the jackknife test reached about 0.5. These results indicate that the SVM method could become a powerful tool for predicting peptidyl prolyl cis/trans conformation.

The correlation between proline synonymous codon usage and the local amino acids, and the correlation between proline synonymous codon usage and the isomerization of the prolyl peptide bond, were both investigated in the E. coli genome using a novel method based on information theory. The results show that, in the peptide chain, the residue at the first position on the C-terminal side strongly influences proline synonymous codon usage, and that proline synonymous codons carry some factors that influence the isomerization of the prolyl peptide bond.

Recently, the poly-L-proline type II (PPII) conformation has gained increasing importance. This structure plays vital roles in many biological processes. We present an SVM method for predicting PPII conformation from local sequence. The total accuracy on the independent testing set and the jackknife estimate both reached about 70%, and the Matthews correlation coefficient reached about 0.4. Comparing the results for training and testing datasets with different sequence identities suggests that the performance of this method is correlated with the sequence identity of the dataset. The parameter of the SVM kernel function is also an important factor. The propensities of residues at different positions were also analyzed. Z-scores show that proline (P) and glycine (G) are the two most important residues for the PPII conformation.

Hidden Markov models (HMM) were also applied to the prediction of the PPII conformation. The total success rate reached 63% and the Matthews correlation coefficient reached about 0.3. The overall performance of the HMM method is worse than that of the SVM method, but this is only a simple application of HMMs: apart from the architecture prior parameter, no parameters were adjusted. The ability of HMMs to predict PPII therefore needs further investigation.

A proline-rich oligopeptide, bradykinin potentiating peptide (BPP), was chosen for study by molecular dynamics. BPP is an important drug for controlling unbalanced cardiovascular function and was the first effective anti-hypertensive ACE inhibitor used in human subjects. The distribution of prolyl peptide bond conformations, the Ramachandran distribution, main-chain atom distances, and the distribution of conformational energy after 300 rounds of simulated annealing were analyzed. The results show that the two conformations of the prolyl peptide bond do not readily interconvert, which supports the view that BPP's two biological properties each arise from a distinct conformation. The function of BPP may not depend on a single conformation, and the conformations of the other peptide bonds also have an influence. The cis conformation is more compact than the trans conformation, but both are very flexible, except trans-BPP-9a, which is very rigid. The energy distributions fit a Gaussian distribution.
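The per-class correct rates and the Matthews correlation coefficient quoted in the abstract are both derived from a two-class confusion matrix. The following is a minimal Python sketch of those two measures, not the author's code. The counts are hypothetical: they assume a balanced test set of 1000 cis and 1000 trans bonds, chosen only so that the per-class rates come out at 76.7% and 76.6%. The abstract does not report the actual confusion matrix or the class balance of its test sets.

```python
from math import sqrt


def per_class_rates(tp, fn, tn, fp):
    """Return (correct rate for the positive class, correct rate for the negative class)."""
    return tp / (tp + fn), tn / (tn + fp)


def matthews_cc(tp, fn, tn, fp):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Common convention: report 0.0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical counts (NOT from the dissertation): cis is the positive class,
# with 1000 cis and 1000 trans examples assumed for illustration.
tp, fn = 767, 233   # cis predicted as cis / cis predicted as trans
tn, fp = 766, 234   # trans predicted as trans / trans predicted as cis

cis_rate, trans_rate = per_class_rates(tp, fn, tn, fp)
mcc = matthews_cc(tp, fn, tn, fp)
print(f"cis correct rate:   {cis_rate:.3f}")
print(f"trans correct rate: {trans_rate:.3f}")
print(f"MCC:                {mcc:.3f}")
```

Under this balanced-set assumption the coefficient comes out near 0.53, in the same range as the "about 0.5" reported for the jackknife test. The same per-class rates on a strongly imbalanced set would give a lower coefficient, so the value depends on class balance as well as on the correct rates.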
Keywords/Search Tags: Proline, peptide bond, trans conformation, cis conformation, poly-proline type II, synonymous codon, neural network, support vector machine, hidden Markov model, simulated annealing