
Generalized mathematical models for the reconstruction of evolutionary trees

Posted on: 2003-05-29
Degree: Ph.D.
Type: Dissertation
University: University of California, Los Angeles
Candidate: Schadt, Eric Emil
Full Text: PDF
GTID: 1460390011480747
Subject: Mathematics
Abstract/Summary:
This dissertation generalizes previous models for nucleotide and codon substitution and rate variation in molecular phylogeny. The single nucleotide substitution model is a generalization of Kimura's Markov chain model for single nucleotide substitutions and incorporates more flexible transition rates and consequently allows irreversible as well as reversible chains. The codon substitution and rate variation models are extensions of this basic nucleotide model. In developing these models and associated algorithms, particular attention is paid to (a) reversibility of the process, (b) acceptance and rejection of proposed codon changes, (c) varying rates of evolution among codon sites, and (d) the interaction of these sites in determining evolutionary rates. To accommodate spatial variation in rates among sites in a given sequence, Markov random fields rather than Markov chains are introduced. Because these innovations complicate maximum likelihood estimation in phylogeny reconstruction, it is necessary to formulate new algorithms for the evaluation of the likelihood and its derivatives with respect to the underlying kinetic, acceptance, and spatial parameters. Further, to derive the most from maximum likelihood analysis of sequence data, it is useful to compute posterior probabilities assigning residues to internal nodes and evolutionary rate classes to codon sites. It is also helpful to search through tree space in a way that respects accepted phylogenetic relationships. An integrated model incorporating generalized nucleotide and codon substitution rates, site-to-site heterogeneity, branch-to-branch heterogeneity, and correlation between spatially separated sites, has been implemented in the software program LINNAEUS, which has been made freely available. The phylogeny program LINNAEUS is applied to several data sets and comparisons to other models are presented. 
Decoding algorithms are applied to the data to demonstrate that the more generalized models can enhance biological interpretation of sequence data. The implementation of the algorithms is discussed from the standpoint of performance and model flexibility.
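
The distinction the abstract draws between reversible and irreversible nucleotide chains can be illustrated with a minimal sketch. This is not code from LINNAEUS or from the dissertation; the function names, the rate values, and the cyclic example are all assumptions chosen for illustration. The sketch builds a 4x4 rate matrix from arbitrary off-diagonal rates, estimates its stationary distribution, and tests the detailed-balance condition pi_i * q_ij = pi_j * q_ji, which holds for Kimura's two-parameter model but can fail once the rates are left unconstrained.

```python
# Illustrative sketch only: names and rate values are hypothetical,
# not taken from the dissertation or from LINNAEUS.
NUCS = "ACGT"
TRANSITIONS = ({"A", "G"}, {"C", "T"})  # purine<->purine, pyrimidine<->pyrimidine


def rate_matrix(rates):
    """Build a 4x4 generator Q from a dict {(from, to): rate} of off-diagonal rates."""
    n = len(NUCS)
    q = [[0.0] * n for _ in range(n)]
    for (a, b), r in rates.items():
        q[NUCS.index(a)][NUCS.index(b)] = r
    for i in range(n):
        # Each row of a generator sums to zero.
        q[i][i] = -sum(q[i][j] for j in range(n) if j != i)
    return q


def stationary(q, iters=10000, tol=1e-13):
    """Stationary distribution by power iteration on the uniformized chain I + Q/mu."""
    n = len(q)
    mu = 2.0 * max(-q[i][i] for i in range(n))  # factor 2 keeps the chain aperiodic
    p = [[(1.0 if i == j else 0.0) + q[i][j] / mu for j in range(n)] for i in range(n)]
    pi = [1.0 / n] * n
    for _ in range(iters):
        new = [sum(pi[i] * p[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new, pi)) < tol:
            return new
        pi = new
    return pi


def is_reversible(q, pi, tol=1e-8):
    """Check detailed balance: pi_i * q_ij == pi_j * q_ji for every pair i < j."""
    n = len(q)
    return all(
        abs(pi[i] * q[i][j] - pi[j] * q[j][i]) <= tol
        for i in range(n)
        for j in range(i + 1, n)
    )


def kimura(alpha, beta):
    """Kimura two-parameter model: rate alpha for transitions, beta for transversions."""
    rates = {}
    for a in NUCS:
        for b in NUCS:
            if a != b:
                rates[(a, b)] = alpha if {a, b} in TRANSITIONS else beta
    return rate_matrix(rates)


def cyclic(forward, other):
    """Hypothetical irreversible chain: A->C->G->T->A runs faster than all other moves."""
    rates = {(a, b): other for a in NUCS for b in NUCS if a != b}
    for i, a in enumerate(NUCS):
        rates[(a, NUCS[(i + 1) % len(NUCS)])] = forward
    return rate_matrix(rates)


q_k = kimura(alpha=2.0, beta=1.0)
q_c = cyclic(forward=2.0, other=1.0)
print(is_reversible(q_k, stationary(q_k)))
print(is_reversible(q_c, stationary(q_c)))
```

Both example chains share the same uniform stationary distribution, so the difference in the reversibility check comes from the asymmetry of the rates alone. A full phylogenetic likelihood would additionally need transition probabilities along each branch, which this sketch does not attempt.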
Keywords/Search Tags:Model, Codon substitution, Nucleotide, Evolutionary, Generalized, Data, Algorithms