
Constructing Paleontological Phylogenetic Tree Based On Hierarchical Inference And Parsimonious Clustering

Posted on: 2020-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: D D Shen
GTID: 2370330590481881
Subject: Computer application technology
Abstract/Summary:
Constructing paleontological phylogenetic trees is an important way to explore the origin of early organisms. In paleontology, phenotype data collected from fossils are the main material available for phylogenetic analysis. However, missing values and inapplicable states in these data often hinder the construction of paleontological phylogenetic trees. To address these problems, this thesis integrates prior knowledge of phylogenetic analysis with the characteristics of paleontological phenotype data and proposes a method for constructing paleontological phylogenetic trees based on hierarchical inference and parsimonious clustering. The main contents of this thesis are as follows.

(1) Based on the logical dependence between features in phenotype data, a hierarchical structure model of features is established. On top of this model, a method for imputing missing values in phenotype data is proposed. Specifically, a hierarchical inference framework is proposed, and distance-weighted K-nearest neighbor is introduced into the framework to estimate the missing values. Experiments show that this method outperforms the fuzzy optimization used in the traditional method under many missing ratios.

(2) To address the difficulty that inapplicable states in phenotype data cause for phylogenetic tree reconstruction, this thesis proposes a method for constructing and optimizing phylogenetic trees based on parsimonious clustering. The method has two stages: phylogenetic tree reconstruction and optimal tree search. For tree reconstruction, parsimonious clustering is proposed by integrating the hierarchical structure model with the polarity of features. For the search for the most parsimonious tree, simulated annealing, a heuristic optimization algorithm, is employed to select the optimal tree according to the principle of parsimony. Compared with existing methods for treating inapplicable states, experiments show that this method reduces the Robinson-Foulds (RF) distance by about 0.125 on average.

(3) Based on the analysis of missing values and inapplicable states in paleontological phenotype data, and building on (1) and (2), a method for constructing paleontological phylogenetic trees based on hierarchical inference and parsimonious clustering is proposed. The method first combines distance-weighted K-nearest neighbor with the hierarchical inference framework to estimate the missing values in the paleontological phenotype data, reducing the ambiguity of the data while keeping the data interpretable. The method in (2) is then employed to construct and optimize the phylogenetic tree in the presence of inapplicable states. Experiments show that the paleontological phylogenetic tree constructed in this way is basically consistent with the generally accepted phylogenetic tree topology. The method is therefore effective for constructing paleontological phylogenetic trees from data with missing values and inapplicable states.

In summary, the proposed method is better suited to constructing paleontological phylogenetic trees from data with missing values and inapplicable states. The effective construction of such trees provides more evidence for paleontologists exploring the origin of organisms.
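As an illustration of the distance-weighted K-nearest neighbor step mentioned in (1), the following is a minimal sketch and not the thesis's implementation. It omits the hierarchical inference framework entirely and assumes a simple setting: each taxon is a row of discrete character states, a missing state is marked with `None`, the distance between two taxa is the fraction of mismatching states among characters observed in both, and each neighbor votes with weight 1/(d + eps). The names `MISSING`, `distance`, `impute`, and the example matrix are hypothetical.

```python
from collections import defaultdict

MISSING = None  # hypothetical marker for a missing ("?") character state


def distance(a, b):
    """Fraction of mismatching states over characters observed in both taxa."""
    shared = [(x, y) for x, y in zip(a, b)
              if x is not MISSING and y is not MISSING]
    if not shared:
        return float("inf")  # no comparable characters
    return sum(x != y for x, y in shared) / len(shared)


def impute(matrix, k=3, eps=1e-9):
    """Return a copy of matrix with missing states filled by a weighted vote."""
    result = [list(row) for row in matrix]
    for i, row in enumerate(matrix):
        for j, state in enumerate(row):
            if state is not MISSING:
                continue
            # Candidate neighbors: other taxa in which character j is observed.
            candidates = [
                (distance(row, other), other[j])
                for m, other in enumerate(matrix)
                if m != i and other[j] is not MISSING
            ]
            candidates = [c for c in candidates if c[0] != float("inf")]
            candidates.sort(key=lambda c: c[0])
            # Each of the k nearest neighbors votes with weight 1 / (d + eps).
            votes = defaultdict(float)
            for d, s in candidates[:k]:
                votes[s] += 1.0 / (d + eps)
            if votes:  # leave the state missing if no neighbor is usable
                result[i][j] = max(votes, key=votes.get)
    return result


# Hypothetical toy matrix: 4 taxa, 5 binary characters.
example = [
    [0, 0, 0, 0, MISSING],  # taxon with one missing state
    [0, 0, 0, 1, 1],        # distance 0.25 to the first taxon
    [1, 1, 0, 0, 0],        # distance 0.50
    [1, 1, 1, 0, 0],        # distance 0.75
]
filled = impute(example, k=3)
print(filled[0])
```

In this toy matrix two of the three neighbors carry state 0, but the closest neighbor carries state 1 with weight of about 4 against a combined weight of about 3.33, so the weighted vote fills in 1 where an unweighted majority vote would fill in 0.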
Keywords/Search Tags: Phylogenetic analysis of paleontology, Hierarchical architecture of features, Missing values imputation, Inapplicable states, Phenotype data