Research On Variable Selection In High Dimensional Data

Posted on: 2018-09-14 | Degree: Master | Type: Thesis
Country: China | Candidate: C Y Zeng | Full Text: PDF
GTID: 2417330548474729 | Subject: Statistics
Abstract/Summary:
With the rapid development and wide adoption of information technology, high-dimensional data have become common, which in turn has driven research on variable selection. This thesis studies three variable selection methods: Sparse Principal Component Analysis (SPCA), an improved Sure Independence Screening (SIS), and an improved Locally Linear Embedding (LLE). First, we study the theory of SPCA. The results show that when the feature dimension is smaller than the sample size, SPCA selects variables effectively; when the feature dimension is larger than the sample size, its variable selection performance is poor. Second, we improve the SIS method by proposing a new screening criterion, the weighted absolute Pearson correlation coefficient, and extend SIS to variable selection in nonlinear models. Numerical experiments show that the proposed method identifies the true variables more accurately than the traditional method. Finally, we introduce three variable selection methods based on manifold learning: multidimensional scaling, isometric mapping, and LLE. We improve LLE by incorporating the entropy weight method, so that the linear reconstruction focuses mainly on the feature variables that strongly influence the response variable, which increases the reliability of variable selection. Numerical experiments show that the improved LLE method significantly improves variable selection performance.
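As a rough illustration of the screening idea behind SIS, the sketch below ranks features by the absolute value of their marginal Pearson correlation with the response and keeps the top d. This is the classical, unweighted criterion only; the abstract does not define the weights used in the weighted absolute Pearson correlation coefficient, so no weighting is attempted here. The function name `sis_screen`, the parameter `d`, and the simulated data are assumptions made for illustration and do not come from the thesis.

```python
import numpy as np


def sis_screen(X, y, d):
    """Classical SIS-style screening (illustrative sketch, not the thesis's method).

    Returns the indices of the d features with the largest absolute
    marginal Pearson correlation with y, plus all the correlations.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature column
    yc = y - y.mean()        # center the response
    denom = np.sqrt((Xc ** 2).sum(axis=0) * (yc ** 2).sum())
    # Constant columns have zero variance; give them correlation 0, not NaN.
    corr = np.divide(Xc.T @ yc, denom,
                     out=np.zeros(X.shape[1]), where=denom > 0)
    order = np.argsort(-np.abs(corr), kind="stable")  # descending |corr|
    return order[:d], corr


# Hypothetical simulated data: p > n, only features 0 and 1 affect y.
rng = np.random.default_rng(0)
n, p = 100, 500
X = rng.standard_normal((n, p))
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5 * rng.standard_normal(n)

selected, corr = sis_screen(X, y, d=20)
print("kept features:", sorted(selected.tolist()))
```

Because each feature is scored independently, the cost grows linearly in the number of features, which is what makes this kind of screening practical when the feature dimension far exceeds the sample size. The same independence is its limitation: a purely marginal linear correlation can miss variables that matter only jointly or nonlinearly, which is presumably the gap that a modified criterion such as the one proposed in the thesis is meant to address.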
Keywords/Search Tags: variable selection, Sparse Principal Component Analysis (SPCA), Sure Independence Screening (SIS), weighted absolute Pearson correlation coefficient, manifold learning, locally linear embedding