The Impact Of Among-Site Rate Variation On Recombination Detection And Its Solution

Posted on: 2008-02-09  Degree: Master  Type: Thesis
Country: China  Candidate: J Q Dai  Full Text: PDF
GTID: 2120360215994271  Subject: Biochemistry and Molecular Biology
Abstract/Summary:
Molecular phylogenetic trees can address questions that are intractable for classical methods. A molecular phylogenetic tree is reconstructed from the nucleotide sequences of different species and consists of a topology and branch lengths: the topology reflects the relationships among the species, and the branch lengths reflect evolutionary distances. The underlying assumption of most phylogenetic tree reconstruction methods is that there is one set of hierarchical relationships among the taxa. While this is a reasonable assumption for most DNA sequence alignments, it can be violated in certain bacteria and viruses, because recombination is an important source of evolutionary change in these organisms. The resulting transfer or exchange of DNA sequences can change the topology in the affected region, which results in conflicting phylogenetic information from different regions of the alignment. If undetected, the presence of these so-called mosaic sequences can lead to systematic errors in phylogenetic tree estimation. Their detection is therefore a crucial prerequisite for consistently inferring the evolutionary history of a set of DNA sequences.
In the last ten years, a variety of phylogenetic methods for detecting recombination have been developed. Many methods for identifying the nature and the breakpoints of the resulting mosaic structure move a window along the sequence alignment and compute a phylogenetic divergence score for each window position. More recently, a method based on Bayesian inference and a hidden Markov model (HMM) has been developed. The existing methods for detecting recombination assume that the substitution rate does not vary along the nucleotide sequence, but this assumption does not hold for real data.
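The window-based scan described above can be illustrated with a minimal sketch. It is a simplified stand-in, not the thesis's method: instead of a tree-based phylogenetic divergence score it uses the proportion of differing sites (p-distance) between a query sequence and two putative parental sequences. The function names `p_distance` and `window_scores`, the window size, and the toy sequences are all assumptions made for illustration.

```python
def p_distance(a, b):
    """Proportion of differing sites between two equal-length aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same aligned length")
    if not a:
        raise ValueError("sequences must not be empty")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def window_scores(query, ref_a, ref_b, window, step):
    """Slide a window along the alignment and score each position.

    The score is d(query, ref_a) - d(query, ref_b): negative values mean the
    query is closer to ref_a in that window, positive values closer to ref_b.
    A sign change between windows hints at a possible breakpoint.
    """
    if not (len(query) == len(ref_a) == len(ref_b)):
        raise ValueError("all sequences must have the same aligned length")
    scores = []
    for start in range(0, len(query) - window + 1, step):
        end = start + window
        d_a = p_distance(query[start:end], ref_a[start:end])
        d_b = p_distance(query[start:end], ref_b[start:end])
        scores.append((start, d_a - d_b))
    return scores


if __name__ == "__main__":
    # Toy mosaic (hypothetical data): first half copied from ref_a,
    # second half copied from ref_b.
    ref_a = "A" * 20
    ref_b = "C" * 20
    query = "A" * 10 + "C" * 10
    for start, score in window_scores(query, ref_a, ref_b, window=10, step=10):
        print(start, score)
```

In this toy alignment the score flips from negative to positive at the window that starts at position 10, which is where the mosaic switches parent. Real methods replace the p-distance with a score derived from trees fitted in each window.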
According to earlier research, continuous distributions can be used to model rate variation among sites, and by far the most commonly used is the gamma distribution, with a shape parameter between 0.2 and 3.5. Because the assumption behind the existing recombination detection methods departs from real data, we study the impact of among-site rate variation on recombination detection and propose a solution. We applied the gamma distribution to model rate variation among sites and generated nucleotide data with different shape parameters. Recombination detection is modeled with an HMM, and the parameters are sampled from the posterior distribution using Monte Carlo simulation. Our results indicate that the accuracy of the existing recombination detection method is correlated with the shape parameter: as the shape parameter decreases, accuracy declines and stability becomes poor. The method we propose detects recombination more accurately, especially when the shape parameter is small.
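To make the role of the shape parameter concrete, the following sketch draws one relative substitution rate per site from a gamma distribution. It assumes the common convention of fixing the mean rate at 1 (scale = 1/shape), which the abstract does not state explicitly; the function names, the seed, and the number of sites are illustrative choices, not details from the thesis.

```python
import random
import statistics


def sample_site_rates(n_sites, shape, seed=None):
    """Draw one relative rate per site from Gamma(shape, scale=1/shape).

    With scale = 1/shape the expected rate is 1, so the shape parameter
    alone controls how strongly rates vary among sites (assumed convention).
    """
    if shape <= 0:
        raise ValueError("shape must be positive")
    rng = random.Random(seed)
    return [rng.gammavariate(shape, 1.0 / shape) for _ in range(n_sites)]


def coefficient_of_variation(rates):
    """Standard deviation divided by the mean; larger means more heterogeneity."""
    return statistics.pstdev(rates) / statistics.fmean(rates)


if __name__ == "__main__":
    # Shape values spanning the 0.2 to 3.5 range mentioned in the abstract.
    for shape in (0.2, 1.0, 3.5):
        rates = sample_site_rates(10000, shape, seed=1)
        print(
            f"shape={shape}: mean={statistics.fmean(rates):.3f}, "
            f"CV={coefficient_of_variation(rates):.3f}"
        )
```

For a gamma distribution with mean 1, the coefficient of variation is 1/sqrt(shape), so a small shape parameter corresponds to strong rate heterogeneity. That is the regime in which the abstract reports the existing method losing accuracy.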
Keywords/Search Tags: recombination, evolution rate, HMM, Gamma distribution, Monte Carlo