Mining The Protein Sequence Features Related To Protein Thermostability

Posted on: 2017-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: H J Cong
GTID: 2310330503987813
Subject: Computer application technology
Abstract/Summary:
Proteins are common macromolecules in organisms and form the basis of life, and many of their properties have long been a focus of biological science. Thermostability is one of the most important of these properties. A protein with better thermostability can tolerate high-temperature environments and therefore has broad application prospects. So far, however, only a few sequence or structural features related to protein thermostability have been proposed. In this study, we use a bioinformatics approach to mine protein sequence features related to protein thermostability and to predict mutational sites that can enhance it. The study consists mainly of the following:

(1) A homologous protein data set was collected from bacterial strains with different optimal temperatures; the homologous proteins were extracted from the genomes of those strains. Amino acid composition, evolutionary characteristics, amino acid indices, and protein secondary structure were used to identify residues that have an obvious effect on the optimal temperature.

(2) An enzyme data set containing proteins with different optimal temperatures was constructed. A sequence feature extraction method, the fuzzy matching of short segments to proteins, is proposed in this study. A short segment is matched to a complete protein sequence according to a certain rule. The match frequencies of the short segment, and the correlation coefficient between those frequencies and the optimal temperatures, are then calculated. Short segments with a high correlation coefficient with the optimal temperatures were selected to build a characteristic short-segment library. Based on the characteristic segments in the library, mutational sites related to protein thermostability can be predicted. In comparisons with the traditional prediction method and with the prediction data in the thermodynamics mutation database, the method showed good prediction performance and could be useful in experimental design.

(3) The method was then applied to mutation prediction. A mesophilic amylase and a thermostable amylase were selected as experimental materials, and the method was used to predict mutational sites related to amylase thermostability. The results showed that the method could find the corresponding sites related to the thermostability of amylase.

The prediction method proposed in this study is a local feature matching method. It can quickly identify the target regions or sites of a protein that are related to thermostability. Compared with global prediction methods, it makes it easier for biologists to design protein mutations and to understand the relationship between protein sequence and function.
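The segment-matching step in (2) can be illustrated with a minimal sketch. The abstract does not state the matching rule, so the code below assumes one for illustration only: a short segment matches a window of the protein sequence when the two differ at no more than `max_mismatches` positions (Hamming distance). The function names, the mismatch threshold, and the normalization of the match frequency by the number of windows are all hypothetical choices, not the thesis's actual method.

```python
from math import sqrt


def fuzzy_match_count(segment, sequence, max_mismatches=1):
    """Count windows of `sequence` that differ from `segment` at
    no more than `max_mismatches` positions (assumed matching rule)."""
    k = len(segment)
    count = 0
    for i in range(len(sequence) - k + 1):
        window = sequence[i:i + k]
        mismatches = sum(a != b for a, b in zip(segment, window))
        if mismatches <= max_mismatches:
            count += 1
    return count


def match_frequency(segment, sequence, max_mismatches=1):
    """Fuzzy match count divided by the number of windows, so that
    proteins of different lengths are comparable (an assumption)."""
    windows = len(sequence) - len(segment) + 1
    if windows <= 0:
        return 0.0
    return fuzzy_match_count(segment, sequence, max_mismatches) / windows


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def segment_temperature_correlation(segment, proteins, max_mismatches=1):
    """`proteins` is a list of (sequence, optimal_temperature) pairs.
    Returns the correlation between match frequency and temperature."""
    freqs = [match_frequency(segment, seq, max_mismatches) for seq, _ in proteins]
    temps = [temp for _, temp in proteins]
    return pearson(freqs, temps)
```

Under these assumptions, a characteristic segment library could be built by computing `segment_temperature_correlation` for each candidate segment over the enzyme data set and keeping the segments whose coefficient exceeds a chosen cutoff. The cutoff and the candidate segment lengths are not given in the abstract.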
Keywords/Search Tags: Protein sequence, Data mining, Protein thermostability, Optimal temperature, Mutation prediction