Research On Learning And Application Of Structured Graphical Models

Posted on: 2018-02-02
Degree: Doctor
Type: Dissertation
Country: China
Candidate: F H Huang
Full Text: PDF
GTID: 1360330596450662
Subject: Computer Science and Technology
Abstract/Summary:
Structure learning for undirected probabilistic graphical models is an important research topic in machine learning. In high-dimensional settings, learning the structure of a graphical model requires prior knowledge, such as sparsity or low-rank structure, for the estimated model to be consistent. With the rapid development of data acquisition technology, heterogeneous and dynamic data are increasingly common, for example gene expression data collected from different tissues and temporal financial data, and these data often have a specific structure, such as matrix-variate or response-variate form. Existing graphical models are not well suited to modeling such complex data. This dissertation therefore proposes several novel structured graphical models for the conditional dependence relationships in heterogeneous or dynamic complex data, develops methods to estimate these models, and establishes their asymptotic properties in high-dimensional settings. On the application side, we study a class of non-convex robust graph-guided models that improve performance on classification and regression tasks by incorporating the estimated structure of the features. We also propose two classes of non-convex stochastic variance-reduced alternating direction methods of multipliers (ADMM) to solve these large-scale graph-guided models. The main contributions of the dissertation are as follows.

(1) We propose a class of joint matrix-variate graphical models to learn the conditional dependence relationships in heterogeneous matrix-variate data. Existing graphical models are mostly built on vector-variate data, yet matrix-variate data arise in many applications, such as image data and financial data. Vector-variate graphical models fitted to vectorized matrix-variate data tend to perform poorly because vectorization ignores the row and column structure of the data. We therefore propose a class of joint matrix-variate Gaussian graphical models that use the matrix-variate data directly, under a matrix-variate normal distribution assumption. Specifically, we estimate the model by structured regularized maximum likelihood and design an effective alternating iteration algorithm to solve the resulting problem. We establish the asymptotic properties of the model, namely consistency and sparsistency, in high-dimensional settings. In particular, our matrix-variate model has a better convergence rate than the corresponding vector-variate models, that is, it has lower sample complexity. Extensive experiments on synthetic data and a real gene expression dataset demonstrate the effectiveness of the joint matrix-variate model.

(2) We propose a class of joint conditional graphical models for the conditional dependence relationships in heterogeneous response-variate data. Existing graphical models mostly model "clean" variables that are not affected by covariates, and they are not well suited to data that are affected by covariates, which are common in applications such as expression quantitative trait loci (eQTL) data. Based on the multivariate conditional normal distribution, we propose a class of joint conditional Gaussian graphical models for heterogeneous response-variate data, estimated by convex regularized maximum likelihood. We develop an efficient approximate Newton method to solve the model and provide a screening technique to speed up the algorithm. We establish the asymptotic properties of the model, namely consistency and sparsistency, in high-dimensional settings. In particular, our model can represent multiple multivariate linear regressions in a single convex formulation. Extensive numerical results on simulations and real datasets demonstrate that our method outperforms the compared methods on structure recovery and structured output prediction.

(3) We propose a class of dynamic conditional graphical models for the dynamic conditional dependence relationships in response-variate data. Existing graphical models mostly focus on static conditional dependence relationships among random variables. In many real systems, however, these relationships vary with time or with other conditions, as when a gene regulatory network changes during cell division. To characterize the dynamic relationships among response variables, we propose a class of dynamic conditional Gaussian graphical models (DCGGMs) based on the conditional normal distribution, and a joint smooth graphical Lasso to estimate them, which combines kernel smoothing with convex structured regularized maximum likelihood. We provide an effective accelerated proximal gradient algorithm to solve the resulting problem. We establish the asymptotic properties of the model, namely consistency and sparsistency, in high-dimensional settings. In particular, we highlight a consistency theory for dynamic graphical models in which the sample size for estimating a local graphical model can be regarded as n^(4/5) when the bandwidth parameter h of the kernel smoother is chosen to be of order n^(-1/5) to describe the dynamics. Extensive numerical experiments on both synthetic and real datasets support the effectiveness of the model.

(4) We propose a class of joint dynamic semiparametric probabilistic graphical models to learn the conditional dependence relationships in non-normally distributed heterogeneous data. We propose a semiparametric fused graphical Lasso to estimate the model, which combines a non-parametric rank-based correlation matrix estimator with convex structured regularized maximum likelihood. In particular, we propose a novel kernel-smoothed Kendall's tau correlation matrix to estimate the dynamic graphical models. Because they relax the normality assumption and use a non-parametric rank-based correlation matrix estimator, our models are more flexible and robust than existing Gaussian graphical models. We use an efficient multi-block ADMM to solve the resulting problem. Numerical results on simulations and on real datasets, such as brain imaging data and stock trading data, demonstrate the effectiveness of the models.

(5) We propose two classes of non-convex stochastic ADMMs with variance reduction for solving large-scale robust graph-guided models, which incorporate graphical structures estimated from data into classification and regression tasks. The first method, the non-convex stochastic variance reduced gradient ADMM (SVRG-ADMM), uses a multi-stage scheme to progressively reduce the variance of the stochastic gradients. The second, the non-convex stochastic average gradient ADMM (SAGA-ADMM), additionally uses old gradients estimated in previous iterations to reduce the variance. Under some mild conditions, we establish an iteration complexity bound of O(1/ε) for the proposed methods to obtain an ε-stationary solution of the non-convex problems. In particular, we provide a general framework for analyzing the iteration complexity of these non-convex stochastic ADMMs with variance reduction. Numerical experiments demonstrate the effectiveness of the methods.
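The effective sample size quoted in contribution (3) is consistent with standard kernel-smoothing arithmetic. As a rough sketch, assuming the time index is rescaled to the unit interval (an assumption; the abstract does not state the scaling), a kernel with bandwidth h gives non-negligible weight to about nh of the n observations at any fixed time point, so taking h of order n^(-1/5) gives:

```latex
n_{\mathrm{eff}} \approx n h,
\qquad
h \asymp n^{-1/5}
\;\Longrightarrow\;
n_{\mathrm{eff}} \asymp n \cdot n^{-1/5} = n^{4/5}.
```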
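The kernel-smoothed Kendall's tau idea in contribution (4) can be illustrated with a minimal sketch. The abstract does not give the dissertation's estimator, so everything below is an assumption made for illustration: the Gaussian kernel, the product-of-weights form of the weighted tau, the sin(pi/2 * tau) transform (the usual rank-based correlation estimate under a Gaussian copula), and all function names, which are hypothetical.

```python
import numpy as np

def gaussian_kernel_weights(times, t0, h):
    """Normalized Gaussian kernel weights centred at t0 with bandwidth h (assumed kernel)."""
    w = np.exp(-0.5 * ((times - t0) / h) ** 2)
    return w / w.sum()

def kernel_kendall_tau(x, y, weights):
    """Weighted Kendall's tau: kernel-weighted average of pairwise concordance signs."""
    sx = np.sign(x[:, None] - x[None, :])      # sign of x_i - x_j for all pairs
    sy = np.sign(y[:, None] - y[None, :])      # sign of y_i - y_j for all pairs
    ww = weights[:, None] * weights[None, :]   # pair weight w_i * w_j
    np.fill_diagonal(ww, 0.0)                  # exclude the i == j pairs
    return float((ww * sx * sy).sum() / ww.sum())

def local_correlation(data, times, t0, h):
    """Local rank-based correlation matrix at time t0 (illustrative, not the thesis estimator)."""
    n, p = data.shape
    w = gaussian_kernel_weights(times, t0, h)
    corr = np.eye(p)
    for j in range(p):
        for k in range(j + 1, p):
            tau = kernel_kendall_tau(data[:, j], data[:, k], w)
            # Map tau to a correlation via the Gaussian-copula relation.
            corr[j, k] = corr[k, j] = np.sin(0.5 * np.pi * tau)
    return corr

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n = 200
    times = np.linspace(0.0, 1.0, n)
    z = rng.standard_normal(n)
    # Second column is a monotone transform of the first plus noise: non-normal but dependent.
    data = np.column_stack([z, np.exp(z) + 0.1 * rng.standard_normal(n), rng.standard_normal(n)])
    h = n ** (-1 / 5)  # bandwidth of the order mentioned in contribution (3)
    print(local_correlation(data, times, t0=0.5, h=h))
```

Because the estimate depends on the data only through ranks, a monotone transform of a variable leaves it unchanged, which is the source of the robustness to non-normality that the abstract describes. A matrix built this way is not guaranteed to be positive semidefinite, so in practice it may need to be projected before being passed to a graphical Lasso solver.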
Keywords/Search Tags: high-dimensional data, matrix-variate Gaussian graphical models, conditional Gaussian graphical models, dynamic graphical models, semiparametric graphical models, joint learning, kernel smoothing technique, robust graph-guided models