Protein structure alignment and detection of conserved tertiary contacts

Posted on: 2006-07-05
Degree: Ph.D
Type: Dissertation
University: Stanford University
Candidate: Ebert, Jessica
Full Text: PDF
GTID: 1450390008958216
Subject: Biophysics
Abstract/Summary:
Protein structure alignment is a fundamental tool for understanding the nature of fold space and for investigating the relationships among structure, function, and sequence. FoldMiner is a novel algorithm capable of detecting structural motifs in a database of proteins without the need for multiple structure or sequence alignments and without relying on prior classification of proteins into families. During each iteration of the algorithm, a motif is defined from the current set of statistically significant alignments and is used both to recruit additional structural neighbors and to discard false positives. FoldMiner thus achieves high specificity and sensitivity by distinguishing between homologous and non-homologous structures according to the regions of the query to which they align. When two proteins of the same fold are superimposed, highly conserved secondary structure elements tend to align to one another, suggesting that FoldMiner consistently identifies the same motif in different members of a fold. A web interface providing access to all of FoldMiner's parameters and a variety of visualization tools has also been developed.

FoldMiner's pairwise alignments are performed by LOCK 2, whose improved accuracy with respect to the original version of LOCK arises from more precise consideration of the structural changes that have occurred throughout evolution. LOCK 2 is symmetric, its scoring system is metric, and its alignments are highly self-consistent.

Because proteins with vastly different functions and sequences can nevertheless adopt the same global fold, it is necessary to distinguish between closely related families on a more local level. Tertiary residue contacts conserved in only one of several globally similar families constitute a structural "fingerprint" of that family and have discriminatory power in differentiating between closely related families. In each of two test cases in which a single hidden Markov model (HMM) detects two families, these discriminatory contacts are shown to contain information capable of distinguishing one family from the other. When the HMM is weighted at discriminatory contact positions for one of the two families, it gives higher scores to members of that family than it does to members of the other family while still producing biologically relevant sequence alignments.
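
The idea of a discriminatory contact, one conserved in a single family but not in globally similar families, can be illustrated with a minimal Python sketch. This is not the dissertation's method: the representation of a contact as a pair of aligned column indices, the function names, the toy data, and the 0.8 and 0.2 conservation thresholds are all assumptions chosen for illustration.

```python
from collections import Counter


def contact_conservation(family):
    """Return the fraction of family members in which each contact appears.

    `family` is a non-empty list of sets; each set holds (i, j) pairs of
    aligned column indices in tertiary contact in one member (hypothetical
    representation, assumed for this sketch).
    """
    counts = Counter(contact for member in family for contact in member)
    return {contact: n / len(family) for contact, n in counts.items()}


def discriminatory_contacts(target, others, min_in=0.8, max_out=0.2):
    """Contacts conserved in `target` but rare in every family in `others`.

    The thresholds `min_in` and `max_out` are illustrative assumptions.
    """
    target_cons = contact_conservation(target)
    other_cons = [contact_conservation(family) for family in others]
    return {
        contact
        for contact, frac in target_cons.items()
        if frac >= min_in
        and all(cons.get(contact, 0.0) <= max_out for cons in other_cons)
    }


# Toy data: two families sharing a global fold, three members each.
family_a = [
    {(3, 40), (10, 55), (12, 30)},
    {(3, 40), (10, 55)},
    {(3, 40), (10, 55), (7, 21)},
]
family_b = [
    {(3, 40), (12, 30)},
    {(3, 40), (7, 21)},
    {(3, 40)},
]

print(sorted(discriminatory_contacts(family_a, [family_b])))  # [(10, 55)]
```

In this toy example the contact (3, 40) is conserved in both families and so carries no discriminatory information, whereas (10, 55) is conserved only in the first family. Positions involved in such contacts are the kind that might be up-weighted in an HMM, although how that weighting is performed is not specified in the abstract.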
Keywords/Search Tags: Structure, Alignments, Conserved, Fold, Family