
Globally Sparse Probabilistic PCA With Nonignorable Missing Data

Posted on: 2022-02-07
Degree: Master
Type: Thesis
Country: China
Candidate: D Wang
Full Text: PDF
GTID: 2518306335954659
Subject: Art and design
Abstract/Summary:
With the continuing development of information technology, data sets have grown rapidly in both scale and dimensionality. As dimensionality grows, missing values become unavoidable. Data that are both high-dimensional and incomplete pose great challenges for statistical analysis, so developing new methods for analyzing high-dimensional missing data has important theoretical and practical significance.

Principal component analysis (PCA) is a widely used technique for dimensionality reduction and data processing, but traditional PCA has no underlying probability model and is hard to interpret. To overcome these shortcomings, methods such as probabilistic PCA and sparse PCA have been proposed one after another. Among them, globally sparse probabilistic principal component analysis (GSPPCA) generalizes and extends ordinary sparse PCA. It also selects a sparse set of effective variables, which overcomes a limitation of classical sparse PCA: when several sparse principal components are computed, the selected variables are difficult to interpret. However, these methods all assume that the data are completely observed, whereas in practical applications missing data are common. Missing data often constitute a systematic problem and can even be an important part of some fields.

This thesis therefore studies globally sparse probabilistic PCA with nonignorable missing data. First, the missing-data mechanism is assumed to be self-censored missing not at random (MNAR). Using the theorem from GSPPCA, an approximation to the explicit expression for the noise-free model under nonignorable missing data is derived. Inference for the relaxed GSPPCA model under nonignorable missing data is then obtained through variational inference. The variational inference algorithm is combined with a gradient ascent algorithm to perform variable selection, and the practicality of the approach is demonstrated through numerical simulation and an analysis of a handwritten digit example.
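
To make the setting in the abstract concrete, the following is a minimal sketch in Python (NumPy) of data drawn from a probabilistic PCA model with a globally sparse loading matrix, followed by a self-censored missingness mechanism in which each entry's chance of being missing depends only on its own value. It is an illustration under stated assumptions, not the thesis's method: the logistic form of the mechanism, the function names `simulate_sparse_ppca` and `self_censored_mask`, and all parameter values are hypothetical choices made here.

```python
import numpy as np


def simulate_sparse_ppca(n, p, d, n_relevant, noise_std, rng):
    """Draw n samples from a PPCA model with a globally sparse loading matrix.

    Only the first n_relevant variables load on the d latent factors; the
    remaining rows of W are zero, so those variables are pure noise.
    """
    W = np.zeros((p, d))
    W[:n_relevant] = rng.normal(size=(n_relevant, d))
    Z = rng.normal(size=(n, d))  # latent scores
    X = Z @ W.T + noise_std * rng.normal(size=(n, p))
    return X, W


def self_censored_mask(X, slope, intercept, rng):
    """Return a boolean mask (True = missing).

    Assumption: the probability that an entry is missing is a logistic
    function of that entry's own value, which is one common way to write
    a self-censored MNAR mechanism.
    """
    prob_missing = 1.0 / (1.0 + np.exp(-(intercept + slope * X)))
    return rng.random(X.shape) < prob_missing


rng = np.random.default_rng(0)
X, W = simulate_sparse_ppca(n=2000, p=10, d=2, n_relevant=4,
                            noise_std=0.5, rng=rng)
mask = self_censored_mask(X, slope=2.0, intercept=-1.0, rng=rng)
X_obs = np.where(mask, np.nan, X)  # hide the censored entries

# Compare column means of the complete data with means of the observed entries.
full_mean = X.mean(axis=0)
observed_mean = np.nanmean(X_obs, axis=0)
print("fraction missing:", mask.mean())
print("complete-data means:", np.round(full_mean, 2))
print("observed-only means:", np.round(observed_mean, 2))
```

Because larger values are more likely to be hidden under this assumed mechanism, the observed-only means are pulled below the complete-data means. This is the sense in which the missingness is nonignorable: an analysis that simply drops the missing entries is biased, so the missingness mechanism has to be modeled together with the data.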
Keywords/Search Tags:Variational Inference, Probabilistic Principal Component Analysis, Global Sparsity, Nonignorable Missing Data