Research On Protein Structure Homology Modeling Based On Domain Clustering

Posted on: 2014-05-24
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F Ren
Full Text: PDF
GTID: 1260330401956192
Subject: Biomedical engineering
Abstract/Summary:
Protein structure prediction is a current focus of bioinformatics research. Homology modeling, which uses a protein of known structure as a template, is regarded as one of the most successful structure prediction approaches. However, two problems limit its practical application: the number of protein structures that can be used as templates is insufficient, and the sequence alignment between query and template is not accurate enough. Both lower the resolution of the structures predicted by homology modeling.

To address these problems, this work presents a protein structure homology modeling strategy based on domain clustering. In this strategy, we first cluster homologous protein sequences using sequence alignment algorithms. We then partition and cluster homologous domains into a template library of structure blocks. Based on this library, we develop a structure-anchored alignment algorithm to improve the accuracy of query-template sequence alignment in homology modeling. Because a suitable template is hard to find for a multi-domain protein sequence, we also propose a domain merge method that improves the accuracy of multi-domain protein structure prediction and compensates for the shortage of structure templates in homology modeling.

The innovative work in this thesis is as follows:

1. A homologous clustering algorithm for protein sequences based on sequence and domain similarity. We first compare the sequences with the program BLASTP. Domain similarity and domain organization are then used as an additional similarity criterion to filter out false relationships. Finally, we convert the similarity matrix into a weighted graph and apply the Markov graph-flow algorithm to group homologous proteins. Experimental results on six completely sequenced eukaryotic genomes show a significant improvement over the NCBI and TIGR clustering results.

2. A domain-based template library. We first map the InterPro database to the PDB database to obtain the domain correspondence. We then partition and cluster the known domains from the PDB database to build a domain-based template library. Preliminary results show that our method can provide partial predictions of better quality for a majority of known protein sequences.

3. A domain merge method. Because our template library is based only on domains, it is hard to find a suitable template in the library for a given multi-domain protein sequence. To address this, we propose a domain merge method that extends the range of the library. A multi-domain protein is first decomposed into individual domains, and the structure of each domain is predicted. These domain structures are then merged into a complete structure, which is optimized using molecular dynamics to resolve conflicting positions and remove large errors. Preliminary experimental results show that our method performs better than traditional methods when only part of the sequence (a domain) has a template.
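
The graph clustering step in item 1 (a similarity matrix converted to a weighted graph, then grouped by a Markov graph-flow algorithm) can be illustrated with a minimal sketch of basic Markov clustering. This is not the thesis implementation. The function names (`mcl`, `normalize_columns`, `matmul`), the inflation value of 2.0, the self-loop weight of 1.0, and the assumption that similarities are already scaled to the range 0 to 1 are all illustrative choices, and the domain-based filtering of false relationships is not modeled here.

```python
def normalize_columns(m):
    """Scale each column so it sums to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] > 0 else 0.0 for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain square matrix product, used for the expansion step."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mcl(weights, inflation=2.0, max_iter=100, tol=1e-6):
    """Cluster a weighted undirected graph with basic Markov clustering.

    weights: square list of lists holding non-negative similarities
             (assumed here to be scaled to [0, 1]).
    Returns a sorted list of clusters, each a sorted list of node indices.
    """
    n = len(weights)
    # Assumption: add a self-loop of weight 1.0 to every node.
    m = [[float(weights[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)  # expansion: flow spreads along paths
        inflated = normalize_columns(
            [[v ** inflation for v in row] for row in expanded]
        )  # inflation: strong flows are boosted, weak flows decay
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Each row that still carries flow lists the members of one cluster.
    clusters = set()
    for i in range(n):
        members = tuple(j for j in range(n) if m[i][j] > 1e-6)
        if members:
            clusters.add(members)
    return sorted(list(c) for c in clusters)
```

In practice the input weights would come from pairwise alignment scores (for example, scores derived from BLASTP hits), normalized in some way before clustering. The inflation parameter controls granularity: larger values tend to produce more, smaller clusters.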
Keywords/Search Tags: Protein, Homology Modeling, Homology Clustering, Alignment, Template