A System To Delineate Protein Domains Based On Refolding Free Energy

Posted on: 2005-04-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Z Q Xie
GTID: 1100360125469053
Subject: Biochemistry and Molecular Biology
Abstract/Summary:
A domain is a level of protein architecture below that of the subunit. It may be defined as a basic unit of structure, function, folding, evolution, and design. Different combinations of domains give rise to diverse tertiary structures with diverse functions. Delineating the domains of a protein is important both conceptually and practically, yet it remains a challenging, unsolved problem. The key to solving it is the principle by which domains are delineated, and that principle must be rational. If a protein domain is regarded as a folding unit, then the Gibbs free energy difference between the folded and unfolded states is what drives domain folding. The native state is widely assumed to be globally thermodynamically stable, with minimum free energy. Because a domain is assumed to be an independent folding unit, it should be structurally compact and thermodynamically favorable. Minimum free energy is therefore the common ground among the definitions of a domain given above.

Based on this idea, we propose a method that delineates protein domains by refolding free energy. Given a protein composed of two continuous domains, we first divide it into two parts at each of a series of sites along the residue sequence and calculate the refolding free energy for each cut, which yields a series of refolding free energy values, one per cut. The optimal division is then chosen from these candidate cuts on the basis of refolding free energy. Fifty proteins have been analyzed with this method. For most of them the boundaries agree with those reported in the literature. A few cases, discussed in detail, differ from the literature but appear more reasonable. In conclusion, the method based on refolding free energy is rational and practical.
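The cut-scanning procedure for a two-domain, continuous protein can be illustrated with a small sketch. The abstract does not give the refolding free energy function or the exact selection rule, so everything named here is an assumption: `fragment_energy` is a hypothetical callable standing in for the refolding free energy of one fragment, `toy_energy` is an invented placeholder that simply counts contacts inside a fragment, and choosing the cut with the lowest summed value is one plausible reading of "optimal".

```python
def scan_cuts(n_residues, fragment_energy, min_len=2):
    """Score every candidate cut site along the sequence.

    fragment_energy(start, end) is a hypothetical stand-in for the
    refolding free energy of residues [start, end).
    """
    profile = []
    for cut in range(min_len, n_residues - min_len + 1):
        total = fragment_energy(0, cut) + fragment_energy(cut, n_residues)
        profile.append((cut, total))
    return profile


def best_cut(profile):
    """Assumed rule: take the cut with the lowest summed value."""
    return min(profile, key=lambda item: item[1])


# Toy data (invented): two tightly packed halves joined by one contact.
CONTACTS = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3),
            (4, 5), (4, 6), (5, 6), (5, 7), (6, 7),
            (3, 4)]


def toy_energy(start, end):
    """Placeholder energy: -1 per contact lying entirely inside the fragment."""
    inside = sum(1 for i, j in CONTACTS if start <= i and j < end)
    return -1.0 * inside


print(best_cut(scan_cuts(8, toy_energy)))  # -> (4, -10.0)
```

In this toy case the cut after residue 3 breaks only the single contact between the two halves, so it gets the lowest score. A real energy function would replace `toy_energy` without changing the scan itself.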
To extend the method to discontinuous domains, we developed a new approach applicable to multi-domain proteins whose domains are continuous or discontinuous, implemented it in the C programming language, and built an integrated system for delineating protein domains, named PDOM. The procedure is as follows. First, we construct a residue-residue contact matrix, apply correspondence analysis, and select the optimal partition of the protein according to refolding free energy and some empirical scoring functions, which divides the protein into two candidate domains. Second, we repeat the first step recursively on each candidate domain, dividing the protein into multiple domains until no candidate domain can be partitioned into two smaller candidates. Compared with the manual partitions reported in the literature by crystallographers, PDOM achieves an accuracy of 76% on a frequently used test set of 55 protein structures. The differences among PDOM, the literature, and SCOP for 13 proteins are discussed extensively.
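The two-step procedure above (contact matrix, correspondence analysis, recursive bisection) can be sketched in a few lines. This is not PDOM, which is written in C and whose scoring the abstract does not specify; it is a minimal Python illustration under stated assumptions. The distance cutoff, the sign-based split on the first correspondence-analysis axis, and `toy_accept` (an invented stand-in for the refolding free energy and empirical scoring functions) are all hypothetical. The first axis is found by power iteration to keep the sketch dependency-free.

```python
import math


def contact_matrix(coords, cutoff):
    """Binary residue-residue contact matrix (diagonal included)."""
    n = len(coords)
    return [[1.0 if math.dist(coords[i], coords[j]) <= cutoff else 0.0
             for j in range(n)] for i in range(n)]


def first_ca_axis(matrix, iterations=200):
    """Row coordinates on the first correspondence-analysis axis."""
    n = len(matrix)
    total = sum(map(sum, matrix))
    mass = [sum(row) / total for row in matrix]  # row masses
    # Standardized residuals: (p_ij - r_i r_j) / sqrt(r_i r_j)
    s = [[(matrix[i][j] / total - mass[i] * mass[j])
          / math.sqrt(mass[i] * mass[j]) for j in range(n)] for i in range(n)]

    def apply(v):
        return [sum(s[i][j] * v[j] for j in range(n)) for i in range(n)]

    v = [float(i + 1) for i in range(n)]
    for _ in range(iterations):  # power iteration on S*S (S is symmetric)
        v = apply(apply(v))
        norm = math.sqrt(sum(x * x for x in v))
        if norm < 1e-12:  # no structure left to split on
            return [0.0] * n
        v = [x / norm for x in v]
    return [v[i] / math.sqrt(mass[i]) for i in range(n)]


def partition(residues, coords, cutoff, accept):
    """Recursively bisect a set of residue indices until accept() says stop."""
    sub = [coords[r] for r in residues]
    axis = first_ca_axis(contact_matrix(sub, cutoff))
    a = [r for r, x in zip(residues, axis) if x > 0]
    b = [r for r, x in zip(residues, axis) if x <= 0]
    if not accept(a, b):
        return [residues]
    return partition(a, coords, cutoff, accept) + partition(b, coords, cutoff, accept)


# Toy chain (invented): residue 7 packs back against residues 0-2.
COORDS = [(0, 0, 0), (1, 0, 0), (2, 0, 0), (5, 0, 0),
          (6, 0, 0), (7, 0, 0), (8, 0, 0), (3, 0, 0)]
CUTOFF = 2.5


def toy_accept(a, b, min_size=3, max_cross=0.2):
    """Placeholder criterion: both parts big enough, few contacts across."""
    if len(a) < min_size or len(b) < min_size:
        return False
    def near(i, j):
        return math.dist(COORDS[i], COORDS[j]) <= CUTOFF
    cross = sum(1 for i in a for j in b if near(i, j))
    within = sum(1 for g in (a, b) for i in g for j in g if i < j and near(i, j))
    return cross <= max_cross * within


domains = partition(list(range(8)), COORDS, CUTOFF, toy_accept)
print(sorted(domains))  # -> [[0, 1, 2, 7], [3, 4, 5, 6]]
```

Because each candidate is a list of residue indices and not a sequence segment, a domain made of non-adjacent stretches (here residues 0-2 plus residue 7) falls out of the same split with no special handling.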
Keywords/Search Tags: domain partition, free energy, contact matrix, correspondence analysis, score function