
Proper Distance Metrics For Phylogenetic Analysis Using Complete Genomes Without Sequence Alignment

Posted on: 2011-04-18; Degree: Master; Type: Thesis
Country: China; Candidate: X W Zhan; Full Text: PDF
GTID: 2120330332964073; Subject: Applied Mathematics
Abstract/Summary:
A shortcoming of most alignment-free correlation distance methods based on composition vectors, developed for phylogenetic analysis using complete genomes, is that the "distances" are not proper distance metrics in the strict mathematical sense. In this paper we describe four string distances on complete genomes for reconstructing phylogenies. These distances are based on common words shared by raw genomic sequences and do not require processing steps such as sequence alignment. We then propose two new correlation-related distance metrics, the chord distance and the piecewise distance, and use them to replace the old "distance" in our dynamical language approach. Four genome datasets are employed to evaluate the effects of this replacement from a biological point of view. We find that the two proper distance metrics yield trees with topologies that are the same as or similar to those obtained with the old "distance", and that agree with the tree of life based on 16S rRNA in a majority of the basic branches. Hence the two proper correlation-related distance metrics proposed here improve our dynamical language approach for phylogenetic analysis.
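To illustrate why a correlation-based "distance" can fail to be a metric, the sketch below checks the triangle inequality on three toy vectors. It rests on assumptions not stated in the abstract: the old "distance" is taken to be (1 - C)/2, where C is the cosine correlation of two composition vectors, and the chord distance is taken to be the chord length between the corresponding unit vectors, sqrt(2(1 - C)). The thesis's exact definitions, including that of the piecewise distance, are in the full text and may differ. The function names and the toy vectors are hypothetical.

```python
import math


def cosine_correlation(u, v):
    """Cosine of the angle between two vectors (assumed form of C)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def correlation_distance(u, v):
    """Assumed form of the old 'distance': (1 - C) / 2."""
    return (1.0 - cosine_correlation(u, v)) / 2.0


def chord_distance(u, v):
    """Assumed chord distance: Euclidean distance between the unit vectors."""
    c = cosine_correlation(u, v)
    # max() guards against tiny negative values caused by rounding.
    return math.sqrt(max(0.0, 2.0 * (1.0 - c)))


# Toy vectors at 0, 45 and 90 degrees (not real composition vectors).
a = (1.0, 0.0)
b = (1.0, 1.0)
c = (0.0, 1.0)

# Triangle inequality: d(a, c) <= d(a, b) + d(b, c)
old_violates = correlation_distance(a, c) > (
    correlation_distance(a, b) + correlation_distance(b, c)
)
chord_ok = chord_distance(a, c) <= chord_distance(a, b) + chord_distance(b, c)

print(old_violates, chord_ok)
```

Under these assumptions the (1 - C)/2 form violates the triangle inequality on this example, while the chord form satisfies it, as expected for a Euclidean distance between normalized vectors.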
Keywords/Search Tags:phylogenetic analysis, complete genome, composition vector, correlation-related distance metric