
The Research On Estimation And Application Of Gaussian Graphical Model

Posted on: 2017-03-02
Degree: Master
Type: Thesis
Country: China
Candidate: L Y Yang
Full Text: PDF
GTID: 2310330536955766
Subject: Control Science and Engineering
Abstract/Summary:
With the development of biotechnology and the completion of the Human Genome Project, a large amount of gene sequence information has become available. Research on gene sequences is a cutting-edge topic, and the Gaussian Graphical Model (GGM) is widely applied in gene sequence analysis. The covariance matrix and the inverse covariance matrix (precision matrix) are widely used in various fields, including principal component analysis, linear classification, quadratic discriminant analysis, and Gaussian model selection. A better estimate of the covariance and precision matrices is therefore important in big data and statistical machine learning.

In the high-dimensional setting, where the dimension of the data is much larger than the sample size, it is often very difficult to obtain a stable and accurate covariance matrix estimate. Classical estimation methods, which assume a fixed dimension and a large sample size, are no longer applicable, and the computational cost is high. High-dimensional applications therefore need an estimation method that can be computed efficiently. In this paper we propose an L1 norm minimization method to estimate the precision matrix and to study Gaussian graphical model estimation.

Gaussian graphical model selection and precision matrix estimation are closely linked. A Gaussian graphical model encodes the conditional independence structure between random variables and is widely used in probability theory, especially in Bayesian statistics and statistical machine learning. When the random variables follow a Gaussian distribution, estimating the Gaussian graphical model is exactly estimating the support set of the precision matrix. In this paper, we introduce a new L1 norm minimization method to study both sparse and non-sparse precision matrix estimation. The method is not limited to a particular sparse model, and it also addresses the selection of the Gaussian graphical model.

First, the rate of convergence obtained in the spectral norm, the infinity norm and the Frobenius norm is faster than that of other existing methods. Second, analysis shows that the convex optimization problem can be converted into a linear programming problem. We then carry out simulation analysis on simulated and real datasets in the R language, and compare the precision matrix estimation performance and the graph recovery performance of the CLIME and Glasso methods under various models. The results show that the method proposed in this paper has high estimation accuracy, good Gaussian graphical model recovery performance, and low computational cost. Finally, the proposed estimation method is used to analyze gene expression arrays from the leukemia and Arabidopsis thaliana datasets.
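
The abstract's statement that, for Gaussian variables, the graph corresponds to the support set of the precision matrix can be illustrated with a small sketch. This is not the thesis's estimator: the 3-variable chain precision matrix is a made-up toy example, and the helper names `invert` and `support_edges` are hypothetical. The sketch only shows that the covariance matrix can be dense while the zeros of the precision matrix mark the missing edges.

```python
from fractions import Fraction


def invert(matrix):
    """Invert a nonsingular square matrix by Gauss-Jordan elimination.

    Exact fractions are used so that zero entries stay exactly zero.
    """
    n = len(matrix)
    # Augment the matrix with the identity: [A | I].
    aug = [
        [Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
        for i, row in enumerate(matrix)
    ]
    for col in range(n):
        # Pick a row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        # Scale the pivot row so the pivot entry becomes 1.
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Eliminate this column from every other row.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    # The right half of the augmented matrix is now the inverse.
    return [row[n:] for row in aug]


def support_edges(precision):
    """Return the pairs (i, j), i < j, whose off-diagonal entry is nonzero."""
    n = len(precision)
    return [
        (i, j)
        for i in range(n)
        for j in range(i + 1, n)
        if precision[i][j] != 0
    ]


# Toy precision matrix for a chain graph 0 - 1 - 2 (assumed for illustration).
precision = [
    [2, -1, 0],
    [-1, 2, -1],
    [0, -1, 2],
]

covariance = invert(precision)              # dense: variables 0 and 2 are correlated
edges = support_edges(precision)            # edges read off the precision matrix
recovered = support_edges(invert(covariance))  # same edges, starting from the covariance

print("covariance[0][2] =", covariance[0][2])
print("edges =", edges)
```

In the toy example, variables 0 and 2 have nonzero covariance, yet the zero in position (0, 2) of the precision matrix says they are conditionally independent given variable 1. With real data the covariance is only estimated, and in high dimensions it cannot be inverted directly, which is why a regularized estimator such as the L1 approach described in the abstract is needed.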
Keywords/Search Tags: covariance matrix, Gaussian model, precision matrix, convergence rate, gene dataset, cluster analysis