Font Size: a A A

Algorithms for quartet based phylogeny reconstruction

Posted on:2010-05-12Degree:Ph.DType:Thesis
University:University of Alberta (Canada)Candidate:Wu, GangFull Text:PDF
GTID:2445390002476198Subject:Biology
Abstract/Summary:
Evolution is an important sub-area of study in biological sciences, where given a taxon set, the goal is to reconstruct their evolutionary history, or phylogeny. Many kinds of data associated with the taxa can be deployed for this task and many reconstruction methods have been proposed and examined in the literature. One very recent approach is first to build a local phylogeny for every subset of 4 taxa, which is called the quartet topology for these 4 taxa, and then to assemble a phylogeny for the whole set of taxa satisfying these predicted quartet topologies. In practice, however, the set of predicted quartet topologies may contain conflicting ones, and we would not be able to find a phylogeny that can satisfy all the predicted topologies. This possibility complicates the overall quartet based phylogeny reconstruction and presents an interesting computational challenge. The research goal of this thesis is to reconstruct phylogenies that can reveal the "true" phylogeny as much as possible.;Besides the phylogeny reconstruction algorithms, we also propose a new model-based approach to measure the performance of a quartet based phylogeny reconstruction method, i.e., to analyze the probability of reconstructing the "true" phylogeny. Under this model, we present three efficient algorithms that can reconstruct the "true" phylogeny with high success probabilities.;We conducted extensive simulation studies and experiments using real datasets to compare the proposed algorithms against previously the best algorithms. The experimental results on both the simulated and real datasets demonstrated the out-performance of the proposed algorithms.;One commonly adopted objective is to satisfy the maximum number of predicted quartet topologies. This is the well known Maximum Quartet Consistency (MQC) problem, which has been studied by a number of researchers in the last two decades. In this thesis, we propose two exact algorithms for solving the general MQC problem. The first one is based on a new equivalent representation of the MQC problem we discovered, that is, to search for an ultrametric matrix to satisfy the maximum number of given quartet topologies. A number of structural properties of the MQC problem in this new representation are characterized, through formulating into Answer Set Programming (ASP), a recent powerful logic programming tool for modeling and solving search problems. The second one is a look-ahead branch-and-bound algorithm, which integrates a number of previous efforts on exact algorithms, heuristics, and approximation algorithms designed for the MQC problem, and several improved search techniques we designed, especially a look-ahead scheme, to solve the problem optimally. Additionally, when the given quartet topology set contains only O(n) (n is the number of taxa) conflicting quartet topologies, we propose an exact algorithm to solve the MQC problem in polynomial time.
Keywords/Search Tags:Quartet, MQC problem, Phylogeny, Algorithms, Taxa
Related items