Phylogenetic approaches to protein function prediction and protein network analysis

Posted on: 2007-09-09
Degree: Ph.D.
Type: Thesis
University: Boston University
Candidate: Wu, Jie
Full Text: PDF
GTID: 2450390005981803
Subject: Engineering
Abstract/Summary:
One of the defining challenges in the post-genomic era is to develop computational and experimental tools that elucidate the mechanisms of biomolecular interactions within the cell and between the cell and its environment. The growing number of completely sequenced genomes from diverse species has opened new possibilities for systematically annotating large numbers of genes by comparative genomics and for deciphering the web of molecular interactions that underlies most cellular systems. High-throughput algorithms that explore the genomic context of genes and capture evolutionary signatures are needed to complement and extend experimental techniques and to enhance our knowledge of protein function at various organizational levels.

This thesis explores phylogenetically based computational techniques that systematically analyze large numbers of genomes to infer protein interaction networks and to quantitatively assign uncharacterized proteins to functional classes. We first pursue a statistical approach to identify protein networks using phylogenetic profiles. Next, we develop a mathematical method to determine the pairwise correlation in the network and to quantitatively assign putative functions to previously uncharacterized genes. Going beyond pairwise functional linkage analysis, we then develop a framework for extracting higher-order information from protein interaction networks. The statistically significant protein groups identified in this way not only provide functional annotation that cannot be obtained in the pairwise case, but also serve as candidates for logical analysis to further decipher the higher-order organization of the cell. Finally, we analyze the modular components of the protein interaction networks that constitute the cell using our online analysis and visualization tool VisAnt. All the inferences drawn from the methods described herein are available online.

These computational methods successfully identify protein interaction networks and reveal functions of previously uncharacterized genes at different levels. This thesis demonstrates that high-throughput computational methods based on statistics and information theory can greatly enrich our knowledge of protein function. The framework in this thesis offers a new paradigm for understanding molecular interaction networks, which can ultimately provide insight at a systems level into the molecular basis of disease.
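As a rough illustration of the pairwise step mentioned above, the sketch below scores two phylogenetic profiles by their mutual information. The abstract does not state which statistic the thesis uses, so mutual information is an assumption here, chosen only because it is one common information-theoretic measure of profile similarity. The protein names and the eight-genome presence/absence vectors are hypothetical toy data.

```python
from math import log2


def mutual_information(a, b):
    """Mutual information (in bits) between two equal-length binary profiles.

    Each profile lists presence (1) or absence (0) of a protein across the
    same ordered set of genomes. This is an illustrative measure, not
    necessarily the statistic used in the thesis.
    """
    if len(a) != len(b) or not a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(a)
    mi = 0.0
    for x in (0, 1):
        for y in (0, 1):
            # Joint and marginal frequencies estimated from the profiles
            p_xy = sum(1 for i, j in zip(a, b) if i == x and j == y) / n
            p_x = sum(1 for i in a if i == x) / n
            p_y = sum(1 for j in b if j == y) / n
            if p_xy > 0:
                mi += p_xy * log2(p_xy / (p_x * p_y))
    return mi


# Hypothetical toy profiles over 8 genomes (assumed data, for illustration only)
profiles = {
    "protA": [1, 1, 0, 0, 1, 0, 1, 0],
    "protB": [1, 1, 0, 0, 1, 0, 1, 0],  # co-occurs with protA in every genome
    "protC": [1, 1, 0, 0, 0, 1, 0, 1],  # statistically independent of protA
}

linked = mutual_information(profiles["protA"], profiles["protB"])
unlinked = mutual_information(profiles["protA"], profiles["protC"])
```

In this toy case the identical pair scores higher than the independent pair. A real analysis would also need a significance test against a background distribution before calling two proteins functionally linked.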
Keywords/Search Tags:Protein, Interaction networks, Computational