Sparse Principal Components Analysis With Elastic Net

Posted on: 2007-09-06
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Zhang
Full Text: PDF
GTID: 2120360242460907
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Principal component analysis (PCA) is a popular data processing and dimension reduction technique. As an unsupervised learning method, PCA has numerous applications, such as handwritten zip code classification and human face recognition. Recently PCA has also been used in gene expression data analysis: Hastie et al. propose the so-called Gene Shaving technique, which uses PCA to cluster highly variable and coherent genes in microarray data.

The success of PCA is due to two important optimality properties. First, principal components sequentially capture the maximum variability among the columns of the data matrix X, thus guaranteeing minimal information loss. Second, principal components are uncorrelated, so we can discuss one principal component without referring to the others.

However, each principal component is a linear combination of all the original variables, which makes the components hard to interpret. We feel it is desirable not only to achieve dimensionality reduction but also to reduce the number of explicitly used variables. An ad hoc way to do this is to artificially set to zero the loadings whose absolute values are smaller than a threshold. Jolliffe and Uddin introduced SCoTLASS to obtain modified principal components with possible zero loadings.

The same interpretation issue arises in multiple linear regression, where the response is predicted by a linear combination of the predictors, and interpretable models are obtained via variable selection. The lasso is a promising variable selection technique that simultaneously produces accurate and sparse models. Zou and Hastie propose the elastic net, a generalization of the lasso, to further improve upon it.

In this paper we introduce a new approach to obtaining modified principal components with sparse loadings, which we call sparse principal components analysis with the elastic net (SPCA-TEN). The method is built on the fact that PCA can be written as a regression-type optimization problem, so the elastic net can be integrated directly into the regression criterion, and the resulting modified PCA produces sparse loadings.
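The regression view of PCA mentioned in the abstract can be illustrated with a minimal sketch. It is not the thesis's algorithm, only a single-component simplification under stated assumptions: the first principal component is regressed on the centered variables with an elastic net penalty, and the coefficients are rescaled to unit length. The function names (`soft_threshold`, `elastic_net_cd`, `sparse_loadings`), the penalty values, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def soft_threshold(value, threshold):
    """Shrink a scalar toward zero by `threshold` (the lasso shrinkage step)."""
    return np.sign(value) * max(abs(value) - threshold, 0.0)


def elastic_net_cd(X, y, lam2, lam1, n_iter=500):
    """Minimise ||y - X b||^2 + lam2 * ||b||_2^2 + lam1 * ||b||_1
    by cyclic coordinate descent, starting from b = 0 (assumes lam2 > 0)."""
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0)
    residual = y - X @ b
    for _ in range(n_iter):
        for j in range(p):
            # Inner product of column j with the residual that excludes b[j].
            rho = X[:, j] @ residual + col_sq[j] * b[j]
            new_bj = soft_threshold(rho, lam1 / 2.0) / (col_sq[j] + lam2)
            # Keep the residual in sync with the updated coefficient.
            residual += X[:, j] * (b[j] - new_bj)
            b[j] = new_bj
    return b


def sparse_loadings(X, lam2, lam1, n_iter=500):
    """Return (elastic net loadings, ordinary PCA loadings) for the first component."""
    Xc = X - X.mean(axis=0)
    U, d, Vt = np.linalg.svd(Xc, full_matrices=False)
    z = U[:, 0] * d[0]  # scores of the first principal component
    b = elastic_net_cd(Xc, z, lam2, lam1, n_iter)
    norm = np.linalg.norm(b)
    return (b / norm if norm > 0 else b), Vt[0]


# Hypothetical data: three variables share a common factor, two are pure noise.
rng = np.random.default_rng(0)
n = 200
factor = rng.normal(size=n)
X = np.column_stack(
    [factor + 0.3 * rng.normal(size=n) for _ in range(3)]
    + [0.3 * rng.normal(size=n) for _ in range(2)]
)

dense, pca = sparse_loadings(X, lam2=1.0, lam1=0.0)      # ridge penalty only
sparse, _ = sparse_loadings(X, lam2=1.0, lam1=100.0)     # ridge + lasso penalty
print("PCA loadings:        ", np.round(pca, 3))
print("ridge-only loadings: ", np.round(dense, 3))
print("elastic net loadings:", np.round(sparse, 3))
```

With the lasso penalty switched off (`lam1=0`), the normalised ridge coefficients coincide with the ordinary PCA loadings, which is the sense in which PCA is a regression-type problem. Raising `lam1` shrinks loadings toward zero, and a sufficiently large value sets loadings exactly to zero.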
Keywords/Search Tags: Ridge Regression, Lasso Regression, PCA, SPCA-TEN, Elastic Net, L1, L2