Analyzing comparative sequence data to understand genome function and evolution

Posted on: 2011-04-08
Degree: Ph.D.
Type: Thesis
University: The George Washington University
Candidate: Prasad, Arjun B
Full Text: PDF
GTID: 2443390002965492
Subject: Biology
Abstract/Summary:
The ever-accelerating production of genome sequence from numerous species is providing new opportunities to examine evolution and evolutionary processes. My thesis work aimed to explore applications of comparative genome sequence datasets. As a first step, we developed a computational method (ExactPlus) that takes advantage of the experiments conducted by natural selection to identify conserved non-coding sequences. This method proved comparable to several other methods of identifying conserved non-coding sequences, and was successfully applied to identify candidates for functional assays of gene-regulatory potential. We next explored the utility of large comparative genome sequence datasets for inferring the phylogenetic relationships among mammals. The large amount of data allowed high-confidence inferences to be made, even for difficult-to-resolve taxa (such as Atlantogenata, Glires, and Theria). For this to succeed, however, we had to carefully control for sources of bias, such as base composition and alignment error. We found a remarkable level of heterogeneity in tree support among regions. To better understand these patterns, we developed a sliding-window-based approach (PartFinder) to identify the boundaries of congruent blocks, and validated this method by using it to examine the genetic relationships among human, chimpanzee, and gorilla. In aggregate, this body of work demonstrates the utility and promise of comparative genome sequence datasets when combined with evolutionary and genomic techniques.
Keywords/Search Tags: Sequence, Genome