Folding And Unfolding Related Information In Protein Sequence

Posted on: 2012-08-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H S Xu
Full Text: PDF
GTID: 1100330338491439
Subject: Biomedical engineering
Abstract/Summary:
Proteins are biological macromolecules polymerized from amino acids, and they play an important role in the structure and function of organisms. These roles presuppose that a protein folds into a specific spatial conformation. However, the theoretical conformational space of a polypeptide is immense. Therefore, as emphasized by Anfinsen's theory, the amino acid sequence shaped by evolution should contain the information needed to assume the native secondary and tertiary structures. Furthermore, a protein's functional role requires it to remain thermostable, and this thermostability information is likewise encoded in the sequence. Foldability, by which a protein folds into a specific conformation, and thermostability, by which it can function, are thus both characteristic properties of the protein sequence. Elucidating these properties would help protein structure prediction and the design of new protein sequences.

One property by which evolved protein sequences maintain foldability and thermostability is the conservation of amino acid substitutions. This conservation has been applied in constructing amino acid score matrices and in recognizing protein folds. Any score matrix has an implicit substitution frequency that reflects the amino acid substitution patterns of its training set. Aligning sequences with a score matrix therefore amounts to evaluating the conservation of the aligned sequences against the substitution conservation of the matrix's training set, so the choice of training set is important both for constructing a score matrix and for optimizing alignments. Herein, three protein structure class-specific score matrices (ALPHASUM, BETASUM and AFBETASUM) were constructed from structure alignments of low-identity (<25%) all-alpha, all-beta, and alpha/beta proteins, respectively. The low identity level was chosen to handle alignments in the 'twilight zone'. In alignment performance tests, the class-specific score matrices were significantly better than a structure-derived matrix (HSDM) and three other general-purpose matrices (BLOSUM30, BLOSUM60 and Gonnet250). The optimized gap penalties presented here also improved alignment performance. These results confirm the advantage and the necessity of constructing matrices at the protein class level.

To elucidate the characteristics of the substitution conservation by which protein sequences maintain their foldability and thermostability, secondary structure-specific score matrices were constructed for the different protein classes. Amino acid cluster trees were built from these score matrices, and the amino acid substitution patterns were analyzed. The results suggest that the protein classes share common substitution patterns but also show subtle class-specific differences. The distinct substitution patterns of a protein class are in fact characteristic of one kind of secondary structure in that class. Moreover, even the same kind of secondary structure has different substitution patterns in different protein classes, which implies that supersecondary structures somehow influence substitution patterns. This can be seen as a complement to the local environment restriction theory originally proposed by Overington et al. We therefore suggest that environment-specific matrices be constructed separately for different protein classes.

Research on amino acid score matrices and substitution patterns relates mainly to the conservation of protein sequences. Given the nonlocal character of the information about protein foldability and thermostability, the coupling information in the sequence sets of two model proteins was analyzed using a modified Statistical Coupling Analysis (SCA) method. The results suggest that the statistical conservation energy from the SCA method properly evaluates site conservation in protein family sequences, and that the average statistical coupling energy is a suitable criterion for identifying structurally and functionally important sites. Analysis of the coupling data of several sites indicated that both local and nonlocal perturbation modes exist in these protein domains. Analysis of the clustered and rearranged coupling data showed that spatially proximate sites do not necessarily have similar perturbation effects or coupling response modes. For a specific protein structure type, there are several different perturbation modes, each related to a different combination of sites. The dominant perturbation modes are closely related to maintaining the structural stability and function of the domain. Different perturbation modes interact through shared perturbing sites and responding sites. That different perturbation modes share sites also implies that the foldability and thermostability information in a protein sequence is superposable to some extent; that is, one site can play multiple roles in the structure.

Based on the protein structural class-specific score matrices and the corresponding secondary structure-specific score matrices, new protein fold recognition methods could be developed. The coupling information of different protein folds can also aid fold recognition. Extracting the coupling information for a given protein fold would help in designing new sequences of that family or in modifying existing sequences to improve their foldability and/or thermostability. A more effective way to improve protein sequence alignment and protein fold recognition should combine substitution conservation information with site coupling information.
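
As a rough illustration of the point that any score matrix carries an implicit substitution frequency inherited from its training set, the sketch below derives log-odds scores from pairwise alignments in the general style of BLOSUM-type matrices. It is not the procedure used to build ALPHASUM, BETASUM or AFBETASUM, which this abstract does not describe. The function name `log_odds_matrix`, the `scale` parameter and the toy alignment are assumptions made for illustration only.

```python
import math
from collections import Counter


def log_odds_matrix(aligned_pairs, scale=2.0):
    """Sketch of a BLOSUM-style log-odds matrix (illustrative, not the thesis method).

    aligned_pairs: iterable of (seq_a, seq_b) aligned strings of equal length,
                   with '-' marking gaps.
    Returns a dict mapping each observed unordered residue pair (a, b), a <= b,
    to a rounded score in units of 1/scale bits.
    """
    pair_counts = Counter()
    for seq_a, seq_b in aligned_pairs:
        for a, b in zip(seq_a, seq_b):
            if a == "-" or b == "-":
                continue  # gapped columns carry no substitution information
            pair_counts[tuple(sorted((a, b)))] += 1

    total = sum(pair_counts.values())
    # Observed frequency q_ab of each unordered residue pair in the training set.
    q = {pair: n / total for pair, n in pair_counts.items()}

    # Background residue frequency: p_a = q_aa + sum over b != a of q_ab / 2.
    p = Counter()
    for (a, b), freq in q.items():
        if a == b:
            p[a] += freq
        else:
            p[a] += freq / 2
            p[b] += freq / 2

    scores = {}
    for (a, b), freq in q.items():
        # Pair frequency expected by chance if residues paired independently.
        expected = p[a] * p[b] if a == b else 2 * p[a] * p[b]
        scores[(a, b)] = round(scale * math.log2(freq / expected))
    return scores


# Toy training set (hypothetical): two short pairwise alignments.
toy_pairs = [("AALL", "AALL"), ("AALL", "AALA")]
print(log_odds_matrix(toy_pairs))
```

Because the scores depend only on the pair frequencies of the training alignments, swapping in alignments from a different protein class changes the matrix, which is the motivation the abstract gives for class-specific matrices. Pairs never observed in the training set are simply omitted here; a real matrix would need pseudocounts or a much larger training set.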
Keywords/Search Tags: Protein folding, protein thermostability, score matrix, amino acid substitution pattern, coupling analysis