Sparse generalized PCA and dependency learning for large-scale applications beyond Gaussianity

Posted on:2017-05-04

Degree:Ph.D

Type:Dissertation

University:The Florida State University

Candidate:Zhang, Qiaoya

Full Text:PDF

GTID:1458390005491756

Subject:Statistics

Abstract/Summary:

The age of big data has renewed interest in dimension reduction. How to cope with high-dimensional data remains a difficult problem in statistical learning. In this study, we consider the task of dimension reduction: projecting data into a lower-rank subspace while preserving maximal information. We investigate the pitfalls of classical PCA and propose a set of algorithms that function in high dimensions, extend to all exponential family distributions, perform feature selection at the same time, and take missing values into consideration.

Based on the best-performing of these algorithms, we develop the SG-PCA algorithm. With acceleration techniques and a progressive screening scheme, it demonstrates superior scalability and accuracy compared to existing methods.

Concerned with the independence assumption of dimension reduction techniques, we propose a novel framework, Generalized Indirect Dependency Learning (GIDL), to learn and incorporate association structure in multivariate statistical analysis. Without constraints on the particular distribution of the data, GIDL takes any pre-specified smooth loss function and is able both to extract the association structure and to infuse it into the regression, classification, or dimension reduction problem. Concluding experiments demonstrate its efficacy.
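
As a rough illustration of the dimension-reduction task named in the abstract (projecting data onto a lower-rank subspace), the sketch below computes classical PCA through the SVD and then sparsifies the loadings by soft-thresholding. This is not the SG-PCA algorithm, whose details the abstract does not give. The function names, the threshold `lam`, and the synthetic data are assumptions made for illustration only, and the thresholding step is a generic stand-in for feature selection under a Gaussian (squared-error) setting.

```python
import numpy as np

def soft_threshold(x, lam):
    """Elementwise soft-thresholding: shrink each entry toward zero by lam."""
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def pca_project(X, rank):
    """Classical PCA: project centered data onto its top `rank` directions."""
    mu = X.mean(axis=0)
    Xc = X - mu
    # Rows of Vt are the principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    V = Vt[:rank].T          # p x rank loading matrix, orthonormal columns
    scores = Xc @ V          # n x rank low-dimensional representation
    return scores, V, mu

def sparse_loadings(V, lam):
    """Hypothetical sparsification: soft-threshold loadings, renormalize columns."""
    W = soft_threshold(V, lam)
    norms = np.linalg.norm(W, axis=0)
    norms[norms == 0] = 1.0  # leave all-zero columns untouched
    return W / norms

# Synthetic data (assumed): a rank-2 signal carried by 3 of 10 features, plus noise.
rng = np.random.default_rng(0)
n, p, rank = 50, 10, 2
signal = rng.normal(size=(n, rank)) @ rng.normal(size=(rank, 3))
X = 0.05 * rng.normal(size=(n, p))
X[:, :3] += signal

scores, V, mu = pca_project(X, rank)
W = sparse_loadings(V, lam=0.1)  # small loadings on noise features are set to zero
```

Entries of `W` that are exactly zero mark features dropped from a component, which is the sense in which sparsity doubles as feature selection here. Extending this to other exponential family distributions or to missing values would require replacing the squared-error fit implicit in the SVD, which this sketch does not attempt.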

Keywords/Search Tags:

Dimension reduction, Data

Related items

1	Dimension Reduction Of Industrial Monitoring Data And Its Application
2	Research On Dimension Reduction Methods For High-dimensional Complex Data
3	Research On Dimension Reduction Methods Of High Dimensional Data
4	Research Of Manifold Learning In Data Dimension Reduction And Classification
5	Cluster analysis of high dimensional data and dimension reduction for regression
6	Dimension reduction algorithms in data mining, with applications
7	A New RKHS-based Approach To Nonlinear Dimension Reduction For Survival Data
8	Research On Methods Of Complex Simulation Data Dimension Reduction And Visualization Clustering
9	A New RKHS-based Approach To Nonlinear Kernel Dimension Reduction
10	Research On Dimension Reduction Algorithms For Preserving Clustering Structures