
Research On Protein Structure Prediction Based On Domain Cluster

Posted on: 2009-08-11
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Ma
GTID: 2120360242983742
Subject: Computer system architecture
Abstract/Summary:
A protein's structure determines its function. As more and more complete genomes have been or are being sequenced, the number of protein sequences will continue to grow exponentially. Experimental approaches cannot keep up with the need to determine structures for newly discovered genes, and this gap is widening further in the current genomic era. For this reason, many computational methods for predicting protein structures have been developed to complement experimental structure determination.

Homology modeling has proved to be the most successful method of protein structure prediction. As the Human Proteomics Initiative project progresses, more and more protein structures will be determined over the next 5 to 10 years, and these structures can serve as templates for homology modeling, giving structural coverage for the majority of sequenced genes. However, homology modeling has two major shortcomings: a shortage of structure templates and the limited accuracy of target-template sequence alignment.

Proteins evolve with their structural and functional domains as independent units. More than two thirds of the protein domains in the InterPro database already have structures in the PDB database, and more than 85% of all proteins have been found to contain one or more conserved sequence domains. This motivates the method proposed here, in which conserved sequence domains, rather than complete protein chains, are used as templates for homology modeling. The main research work is as follows:

1) Target-template alignment based on three-dimensional structure information: structure alignment is currently the most accurate alignment method and often serves as the standard against which other methods are judged. Adding three-dimensional structural information can often improve accuracy and sensitivity, and our domain clustering database contains a large amount of structural information, so we extract a profile from the three-dimensional structure alignment and build a sequence-profile alignment algorithm. Experimental results show that sensitivity and accuracy improve to some degree.

2) Hybrid algorithm based on mixed information: structural information is very important, but a profile extracted solely from structure loses some information, because structural alignment is not meaningful in loop regions. Moreover, in theory, the more information is added, the more accurate the alignment should be. We therefore constructed a profile-profile alignment algorithm, called hybrid, based on one-dimensional, two-dimensional and three-dimensional information, and ran a large number of tests on internal and external benchmarks. The experimental results show that sensitivity and accuracy improve markedly compared with other methods, especially at low similarity.

3) Domain merger method: protein structure prediction based on domain clusters raises a new problem, the merging of domains, because templates cover only domains, not whole proteins. We take the following approach: first, decompose the target sequence into domains using the hybrid algorithm; then predict the structure of each domain; finally, merge and optimize the domain structures. We chose some representative sequences for testing, and the results show that the method performs very well in the absence of an accurate template for the target.

4) Prototype system implementation: we also implemented a prototype system that provides an interactive platform. Users submit a protein sequence through it, the structure is predicted in the background, and the results are returned to the user.
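The sequence-profile alignment idea in item 1) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm: the abstract does not specify its scoring scheme, so the uniform background, the pseudocount, the linear gap penalty, and the names `build_profile` and `align_to_profile` are all assumptions made for illustration. The sketch builds a log-odds profile from a small alignment that is assumed to come from structure superposition, then aligns a target sequence to the profile columns with global dynamic programming.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_profile(aligned_seqs, pseudocount=1.0):
    """Per-column log-odds scores (log2) against a uniform background.

    aligned_seqs: equal-length strings, '-' marks a gap. Here they stand in
    for rows of a structure-based alignment (an assumption of this sketch).
    """
    background = 1.0 / len(AMINO_ACIDS)
    profile = []
    for col in range(len(aligned_seqs[0])):
        counts = Counter(s[col] for s in aligned_seqs if s[col] != "-")
        total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
        profile.append({
            aa: math.log2(((counts[aa] + pseudocount) / total) / background)
            for aa in AMINO_ACIDS
        })
    return profile


def align_to_profile(seq, profile, gap=-2.0):
    """Global alignment of seq to profile columns with a linear gap penalty.

    Returns (score, pairs), where pairs lists (sequence index, column index)
    for every aligned position.
    """
    n, m = len(seq), len(profile)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + profile[j - 1][seq[i - 1]],  # match
                score[i - 1][j] + gap,  # residue aligned to a gap
                score[i][j - 1] + gap,  # profile column skipped
            )
    # Trace back from the bottom-right cell to recover aligned pairs.
    pairs = []
    i, j = n, m
    while i > 0 and j > 0:
        if score[i][j] == score[i - 1][j - 1] + profile[j - 1][seq[i - 1]]:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return score[n][m], pairs


if __name__ == "__main__":
    # Toy domain family; real profiles would come from many aligned members.
    family = ["ACDE", "ACDE", "ACNE"]
    profile = build_profile(family)
    print(align_to_profile("ACDE", profile))
    print(align_to_profile("ACE", profile))
```

A profile-profile method such as the hybrid algorithm in item 2) would replace the per-residue lookup in the match term with a column-to-column score; that extension is not shown here.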
Keywords/Search Tags:protein domain, homology modeling, target-template alignment algorithm, domain merger