Font Size: a A A

Parallelization Research On Maximum Likelihood Method In Molecular Phylogenetic Inference

Posted on:2017-04-23Degree:MasterType:Thesis
Country:ChinaCandidate:J M ZhouFull Text:PDF
GTID:2180330485964656Subject:Bioinformatics
Abstract/Summary:PDF Full Text Request
Phylogeny reconstruction at molecular level to describe the evolutionary process of life is popular now in modern biological research. The common phylogenetic reconstruction algorithms are based on estimation of evolutionary distance, such as UPGMA and Neighbor-Joining (NJ), or based on the optimization principle, such as Maximum Parsimony(MP), Maximum Likelihood(ML), Bayesian Inference(BI), etc. Compared with other algorithms, ML has obvious advantages in statistical consistency, full use of original data and comparability of likelihood trees under the same framework, etc., but the price of these advantages is high as the time complexity of ML is exponential by using global searching. As far as possible to reduce the computing time complexity, practical applications usually apply heuristic reduction algorithm strategy to seek more local optimal results, so that its application is restricted in the field of phylogenetic reconstruction. Based on global search of ML method in phylogenetic reconstruction, this paper adopted mature parallel computing technologies in computer science to do parallelization research on ML method, according to the characteristics of the calculation in the process of likelihood tree reconstruction.First of all, with the analysis of likelihood computation, the serial algorithm Serial-MLE was developed for global search in ML tree reconstruction. By "Four steps method in parallel algorithm design", analysing the parallelizability in both topo space of trees and site space of sequences in ML computation process, and using the lightweight Fork/Join framework to implement the parallel algorithm FJ-Topo for topo space and FJ-Locus for site space. Further experiments showed that FJ-Topo speed up the 3.69 times on topo space, and FJ-Locus speed up 1.55 times on site space. The results illustrated the parallelizability in both topo space of trees and site space of sequences in ML computation in phylogenetic reconstruction.Secondly, in view of the explosive growth of topo space which presented factorial type increasing with the increase of taxa, based on message passing mechanism of MPJ parallel computing framework, the multi-process parallel algorithm MPJ-Topo to accelerate computing in topo space was proposed, meanwhile a hybrid algorithm MPJ-FJ which combined the Fork/Join to accelerate computing in site space was also developed. Experiments showed that MPJ-Topo’s speedup had reached around 7 to 8 times in topo space; under the condition of short sequences, MPJ-FJ showed less acceleration than MPJ-Topo, but with the increase of sequence length, the hybrid MPJ-FJ can promote the acceleration about 10% on the basis of MPJ-Topo.Thirdly, based on the big data processing ability of MapReduce parallel framework, a MapReduce parallel algorithm MR-Topo which run in the explosive growth of topo space was realized; to continue the thought of the hybrid parallel algorithm, another hybrid algorithm MR-FJ which using the Fork/Join to accelerate computing in site space at the same time was also developed. Experiments showed that MR-Topo’s speedup can reach to 52.82 in topo space, MR-FJ’s speedup increase with the increase of taxa and sequence length, the highest can reach to 51.12.In this paper, some new parallel algorithms to accelerate the reconstruction of phylogenetic tree using maximum likelihood method were proposed and tested, according to testing results, especially Mapreduce framework can provide effective means of parallel computing for global searching of optimization of ML method for phylogeny reconstruction under the increasing trend of biological big data.
Keywords/Search Tags:Phylogeny reconstruction, Maximum likelihood method, Parallel computation, Fork/Join framework, MPJ framework, MapReduce framework
PDF Full Text Request
Related items