Compositional Data Analysis Based On Graphical Model

Posted on:2022-02-27

Degree:Master

Type:Thesis

Country:China

Candidate:C H Niu

Full Text:PDF

GTID:2480306482995909

Subject:Statistics

Abstract/Summary:

Compositional data arise in many practical problems, such as the chemical composition of rocks and the consumption structure of residents. A central question in the study of compositional data is how the component variables interact. The unit-sum constraint, a numerical characteristic specific to compositional data, makes this question more challenging. This thesis uses graphical models to describe the relationships between the variables of compositional data and is divided into two parts.

The first part concerns inference of the graphical model structure based on stability selection. Building on the gCoda method, we use stability selection to choose the penalty parameter in the penalized likelihood function for compositional data, and we call the resulting method ss.gCoda. This method uses the information from multiple values of λ and improves the accuracy of the estimation. We first write the penalized likelihood function for compositional data, which combines the negative log-likelihood with an ℓ₁ penalty. We then solve it with the MM algorithm by converting the objective function into a standard glasso problem, and finally use stability selection to choose the best penalty parameter for the graphical model. In numerical simulations, ss.gCoda attains a higher true positive rate (TPR) and Matthews correlation coefficient (MCC) than gCoda. It finds more true edges, which leads to better recovery performance, and it avoids returning an empty model or a TPR of zero.

The second part concerns structure learning of Bayesian networks for compositional data. We write the Bayesian network in matrix form according to the joint probability. A search-based method yields a DAG together with its corresponding completed partially directed acyclic graph (CPDAG) and skeleton graph. In numerical simulations, for a single model, the structural Hamming distance (SHD) between the estimated DAG and the true DAG is relatively stable, as is that between the estimated CPDAG and the true CPDAG. The method performs well in recovering true edges.

Finally, we apply gCoda, ss.gCoda, and the search-based method to the ash data and the olive data to explore the relationships between the components of the dust and of the olive oil, respectively.
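
The evaluation measures named in the abstract (TPR, MCC, and SHD) can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names, the edge-list representation, and the toy graphs are hypothetical, and the SHD variant shown assumes that a reversed edge counts as one error, a convention the abstract does not specify.

```python
from math import sqrt

def edge_confusion(true_edges, est_edges, p):
    """Count (TP, FP, FN, TN) over the p*(p-1)/2 possible undirected edges."""
    true_set = {frozenset(e) for e in true_edges}
    est_set = {frozenset(e) for e in est_edges}
    tp = len(true_set & est_set)
    fp = len(est_set - true_set)
    fn = len(true_set - est_set)
    tn = p * (p - 1) // 2 - tp - fp - fn
    return tp, fp, fn, tn

def tpr(tp, fp, fn, tn):
    """True positive rate: share of true edges that were recovered."""
    return tp / (tp + fn) if tp + fn else 0.0

def mcc(tp, fp, fn, tn):
    """Matthews correlation coefficient; defined as 0 when the denominator is 0."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0

def shd(true_dag, est_dag):
    """Structural Hamming distance between two DAGs given as (parent, child) pairs.

    Assumed convention: each missing or extra adjacency counts once,
    and each reversed edge counts once.
    """
    true_dag, est_dag = set(true_dag), set(est_dag)
    true_skel = {frozenset(e) for e in true_dag}
    est_skel = {frozenset(e) for e in est_dag}
    missing_or_extra = len(true_skel ^ est_skel)
    reversed_edges = sum(1 for (a, b) in true_dag if (b, a) in est_dag)
    return missing_or_extra + reversed_edges

# Toy undirected graphs on p = 4 nodes (hypothetical data).
true_graph = [(0, 1), (1, 2), (2, 3)]
est_graph = [(0, 1), (1, 2), (0, 3)]
counts = edge_confusion(true_graph, est_graph, p=4)
print(counts, tpr(*counts), mcc(*counts))

# Toy DAGs: one reversed edge, one missing edge, one extra edge.
true_dag = [(0, 1), (1, 2), (2, 3)]
est_dag = [(1, 0), (1, 2), (0, 3)]
print(shd(true_dag, est_dag))  # 3
```

With an empty estimated graph, `tpr` returns 0 and `mcc` falls back to 0 because its denominator vanishes, which corresponds to the empty-model case the abstract says ss.gCoda avoids.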

Keywords/Search Tags:

Compositional data, Graphical model, Bayesian network, Stability selection

Related items

1	Model Selection Of Graphical Model Based On Resampling Method With Missing Data
2	Bayesian Variable Selection With Informative Priors For Integrating Data Structural Information
3	High-dimensional Graphical Learning And Application Based On Sparse Data
4	Modeling On The Graphical Model In Microbiome Data Based On Bayesian Neighborhood Selection
5	Decomposition And Collapsibility In Graphical Models
6	Complex Bayesian models: Construction, and sampling strategies
7	Gaussian Graphical Model Selection By The E-MS Algorithm
8	Research On Basic Theory Of Graphical Models
9	The Heteroscedastic Statistical Analysis Based On Compositional Data
10	High Dimensional Stability And Multi-variable Selection Diagram Of The Model