
Research On Penalized Maximum Likelihood Estimation For Gaussian Graphical Mixture Model

Posted on: 2022-10-25 | Degree: Master | Type: Thesis
Country: China | Candidate: X Han
GTID: 2480306311464104 | Subject: Probability theory and mathematical statistics

Abstract/Summary:
The Gaussian mixture model assumes that the data come from multiple Gaussian distributions. Provided it contains enough components, a Gaussian mixture model can approximate very complex data well, which gives it great flexibility. However, when modeling high-dimensional data, estimating a number of parameters that grows quadratically with the dimension is extremely challenging. Benefiting from the development of sparse statistical learning theory, high-dimensional data mining is usually carried out under a sparsity assumption. For the Gaussian mixture model specifically, focusing on the sparsity of the precision matrix (the inverse of the covariance matrix), some researchers proposed a penalized maximum likelihood estimation method, GMLasso, which imposes an L1-type penalty on the precision matrix of each mixture component. The estimate can be computed with the Expectation-Maximization algorithm, whose M-step involves the Graphical Lasso. Penalized maximum likelihood estimation is often used to explore the sparse structure of a Gaussian graphical model. From the perspective of Gaussian graphical model theory, GMLasso regards each mixture component as a sparse Gaussian graphical model; it represents the traditional Gaussian mixture model with undirected graphs and thereby extends the Gaussian graphical model to the Gaussian graphical mixture model. In addition, this penalized maximum likelihood estimation also avoids the likelihood degeneracy problem of maximum likelihood estimation for the Gaussian mixture model.

Studies on the learning theory of Gaussian graphical models show that penalized maximum likelihood estimation based on an L1-type penalty is biased, whereas estimation based on the Adaptive Lasso penalty or the non-convex SCAD penalty can attenuate the bias while still ensuring sparsity. In view of the bias of GMLasso, and based on the above result, this paper replaces the L1-type penalty with the Adaptive Lasso penalty and the non-convex SCAD penalty respectively to improve GMLasso, obtaining two penalized maximum likelihood estimation methods, GMAdaLasso and GMSCAD. The proposed estimates are computed as follows: the Expectation-Maximization algorithm is used to maximize the penalized likelihood function, with the Q function defined as the conditional expectation of the penalized log-likelihood function with respect to the posterior distribution of the latent variables. When the Q function is maximized in the M-step, updating each covariance matrix relies on the penalized likelihood learning theory of the Gaussian graphical model.

In the simulation study, three bivariate Gaussian graphical mixture models whose components have different degrees of sparsity were designed to compare the errors of the covariance matrices estimated by GMLasso, GMAdaLasso and GMSCAD. The simulation results show that the errors of GMAdaLasso and GMSCAD are lower than those of GMLasso overall, and that the reduction in error becomes larger as the data dimension increases. The three methods were also applied to two high-dimensional image datasets for clustering analysis, with clustering performance evaluated by normalized mutual information. The results show that, compared with GMLasso, the clustering performance of GMAdaLasso and GMSCAD improves by up to 30% on the COIL20 image dataset and by about 9% on the USPS handwritten digit image dataset.
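To make the EM scheme described above concrete, the following is a minimal Python sketch of penalized EM for a Gaussian graphical mixture, in which the M-step re-estimates each component's covariance and precision matrix with the graphical lasso (the L1-type penalty, i.e. the GMLasso setting). It is an illustration under my own assumptions, not the thesis's implementation: the function name gmm_glasso_em, the single penalty level alpha, and the crude initialization are chosen only for this example, and the GMAdaLasso/GMSCAD variants would replace the fixed L1 penalty in the M-step with adaptive weights or a SCAD-type local approximation.

```python
# Minimal illustrative sketch (not the thesis code) of penalized EM for a
# Gaussian graphical mixture model with an L1 penalty on each precision matrix.
import numpy as np
from scipy.stats import multivariate_normal
from sklearn.covariance import graphical_lasso

def gmm_glasso_em(X, K, alpha=0.1, n_iter=50, seed=0):
    rng = np.random.default_rng(seed)
    n, p = X.shape
    # crude initialization: K random points as means, inflated pooled covariance
    means = X[rng.choice(n, size=K, replace=False)].astype(float)
    covs = np.stack([np.cov(X.T) + np.eye(p) for _ in range(K)])
    weights = np.full(K, 1.0 / K)

    for _ in range(n_iter):
        # E-step: posterior responsibilities of the latent component labels
        dens = np.column_stack([
            weights[k] * multivariate_normal.pdf(X, means[k], covs[k])
            for k in range(K)
        ])
        resp = dens / dens.sum(axis=1, keepdims=True)

        # M-step: maximize the penalized Q function
        Nk = resp.sum(axis=0)
        weights = Nk / n
        for k in range(K):
            means[k] = resp[:, k] @ X / Nk[k]
            diff = X - means[k]
            emp_cov = (resp[:, k, None] * diff).T @ diff / Nk[k]
            # Graphical lasso on the responsibility-weighted covariance:
            # the L1-penalized precision-matrix update (GMLasso-style).
            # GMAdaLasso / GMSCAD would replace the fixed alpha here with
            # adaptive weights or a SCAD-type local approximation.
            covs[k], _precision = graphical_lasso(emp_cov, alpha=alpha)

    return weights, means, covs, resp
```

A practical implementation would add a log-domain E-step for numerical stability and a convergence check on the penalized log-likelihood; the sketch keeps only the structure of the algorithm described above.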
Keywords/Search Tags:Gaussian Graphical Model, Gaussian Mixture Model, Penalized Maximum Likelihood Estimation, Adaptive Lasso, SCAD