
Research On Bayesian Inference Sampling Algorithm In Molecular Tree Space

Posted on: 2024-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: X P Li
Full Text: PDF
GTID: 2530307091465444
Subject: Computer technology
Abstract/Summary:
With the advancement of modern biological technology, molecular sequence data have become increasingly abundant, and the space of phylogenetic tree topologies used to study the evolutionary relationships between species keeps expanding. Under these conditions, Bayesian inference based on the Random Walk Metropolis (RWM) algorithm suffers from slow mixing. To ensure the efficiency and accuracy of Bayesian inference, optimizing the Markov Chain Monte Carlo (MCMC) sampling algorithm is particularly important. In recent years, the Hamiltonian Monte Carlo (HMC) algorithm, one of the most advanced members of the MCMC family, has begun to be applied to phylogenetic analysis. It can greatly reduce the random-walk behavior of traditional MCMC algorithms, thereby speeding up the mixing of the Markov chains and improving computational efficiency. This thesis examines in depth the limitations of the RWM algorithm when sampling the space of molecular phylogenetic trees and the difficulties HMC methods face in handling multimodality in phylogenetic analysis. To overcome these problems, it proposes a mixed-path Hamiltonian Monte Carlo (MPHMC) method that improves the efficiency and accuracy of phylogenetic analysis. In the complex, multimodal phylogenetic tree space, HMC alone cannot escape a local high-probability region, because it does not obtain proposals from other modes. To improve the robustness of the algorithm, non-HMC update components for the discrete parameters are added to the sampling path at no additional computational cost and are alternated with the deterministic HMC updates. In addition, a branch rearrangement strategy that makes larger topological changes is introduced, so that the whole tree space of the posterior distribution can be traversed more freely. Experiments on eight empirical data sets of different scales show that the MPHMC method samples better from the correct posterior distribution. On large data sets that are difficult to sample, where single-path HMC sampling may fail, the MPHMC method achieves more than 15% higher sampling efficiency than the widely used phylogenetic analysis tool MrBayes (MCMC).
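
The alternation of deterministic HMC updates with non-HMC updates of a discrete parameter can be illustrated with a minimal sketch. This is not the MPHMC implementation from the thesis: it assumes a toy target in which one continuous value `x` stands in for the branch lengths and a binary label `k` stands in for the tree topology, and all names (`MODES`, `hmc_step`, `discrete_step`, `mixed_path_sample`) and tuning values are hypothetical.

```python
import math
import random

# Hypothetical toy target: p(x, k) is proportional to exp(-(x - MODES[k])**2 / 2),
# where k in {0, 1} plays the role of a discrete "topology".
MODES = (-1.0, 1.0)

def log_density(x, k):
    return -0.5 * (x - MODES[k]) ** 2

def grad_log_density(x, k):
    return -(x - MODES[k])

def accept(log_ratio, rng):
    # Metropolis acceptance test for a given log acceptance ratio.
    return rng.random() < math.exp(min(0.0, log_ratio))

def hmc_step(x, k, rng, step_size=0.2, n_leapfrog=10):
    """One HMC update of the continuous parameter x with k held fixed."""
    p = rng.gauss(0.0, 1.0)                      # resample the momentum
    x_new, p_new = x, p
    p_new += 0.5 * step_size * grad_log_density(x_new, k)   # half step
    for i in range(n_leapfrog):
        x_new += step_size * p_new               # full position step
        if i < n_leapfrog - 1:
            p_new += step_size * grad_log_density(x_new, k)
    p_new += 0.5 * step_size * grad_log_density(x_new, k)   # half step
    log_ratio = (log_density(x_new, k) - 0.5 * p_new ** 2) \
              - (log_density(x, k) - 0.5 * p ** 2)
    return x_new if accept(log_ratio, rng) else x

def discrete_step(x, k, rng):
    """One non-HMC (Metropolis) update of the discrete parameter k."""
    k_new = 1 - k                                # symmetric flip proposal
    log_ratio = log_density(x, k_new) - log_density(x, k)
    return k_new if accept(log_ratio, rng) else k

def mixed_path_sample(n_iter, seed=0):
    """Alternate HMC updates of x with Metropolis updates of k."""
    rng = random.Random(seed)
    x, k = 0.0, 0
    samples = []
    for _ in range(n_iter):
        x = hmc_step(x, k, rng)
        k = discrete_step(x, k, rng)
        samples.append((x, k))
    return samples

if __name__ == "__main__":
    draws = mixed_path_sample(5000, seed=1)
    share = sum(k for _, k in draws) / len(draws)
    print(f"share of draws with k = 1: {share:.2f}")
```

In this toy setting the HMC step alone never changes `k`, so a chain without `discrete_step` would stay in whichever mode it started in; the interleaved discrete update is what lets the chain visit both.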
Keywords/Search Tags: phylogenetic analysis, Bayesian inference, Random Walk Metropolis algorithm, tree space, Hamiltonian Monte Carlo algorithm, multimodality, probabilistic path, mixed path