Mining Nonlinear Interactions In High-Dimensional Data

Posted on: 2019-06-09
Degree: Master
Type: Thesis
Country: China
Candidate: Y L Qin
Full Text: PDF
GTID: 2347330569479760
Subject: Statistics
Abstract/Summary:
For ultra-high-dimensional data, most existing work on variable screening focuses on linear regression models. In complex situations, however, the relationship between the response and the predictors is nonlinear, and simple linear models are not flexible enough to capture the underlying model structure. This work considers nonlinear variable selection for sparse, ultra-high-dimensional nonparametric additive models. The statistical problem is to determine which additive components are nonzero. We propose methods for selecting main effects and interaction effects in nonparametric additive models.

In the first part, we use Nonparametric Forward Selection (NFS) to select the nonzero components because of its simplicity and efficiency. The theoretical analysis shows that the proposed method has the sure screening property under some mild technical conditions, even if the predictor dimension is significantly larger than the sample size. Simulation results and a real data analysis demonstrate that the procedure works well with moderate sample size and large dimension and performs better than competing methods.

In nonparametric data analysis, it is also extremely challenging to identify important interaction effects. In the linear model, interaction effects have heavier tails and more complex covariance structures than main effects in a random design, which makes theoretical analysis difficult. The same is true in nonparametric models. A top concern in practice for nonparametric learning is computational feasibility. In the second part, we propose two forward-selection-based methods for interaction screening in nonparametric additive models, called iNFORT (interaction-selection nonparametric procedures featured with forward selection). The iNFORT procedure is designed to be simple to implement: it involves only OLS-type (Ordinary Least Squares) calculations and does not need complex optimization tools. The algorithms avoid storing and manipulating the whole augmented matrix, so the memory and CPU (Central Processing Unit) requirements are minimal. Simulation results and a real data analysis demonstrate that the proposed procedure works well with moderate sample size and large dimension for interaction screening in nonparametric additive models.

As a further extension, we generalize the proposed method to the generalized form of the nonparametric additive model. The good results are helpful for classification problems with nonparametric models.

The full text is divided into six chapters, as follows:

Chapter 1 reviews previous research, compares the strengths and weaknesses of earlier methods, and presents the research ideas and contents of this thesis.

Chapter 2 introduces the basic concepts used in this thesis, including the Forward Regression (FR) method of Wang (2009), the nonparametric additive model and its components, and the sure screening property of variable selection.

Chapter 3 extends the FR method from the linear model to the nonparametric additive model and studies the theoretical proofs, the simulated data, and a real case in detail.

Chapter 4 considers the influence of interactions in the nonparametric additive model and gives a two-step method based on FR to select the interaction effects of this model. The feasibility of the algorithm and the consistency of the variable selection are established theoretically.

Chapter 5 gives the generalized form of the nonparametric additive model and analyzes simulations of main effects selection.

Chapter 6 summarizes the thesis and raises questions for further research.
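
The forward-selection idea behind NFS can be illustrated with a minimal sketch. This is not the thesis's implementation: it assumes each additive component is approximated by a small centered polynomial basis (a stand-in for whatever basis, such as B-splines, the thesis actually uses), it stops after a fixed number of steps rather than using a data-driven criterion, and all function names (`basis_expand`, `rss`, `nonparametric_forward_selection`) are hypothetical. At each step, the candidate predictor whose basis block most reduces the residual sum of squares is added, so only OLS-type fits are required.

```python
import numpy as np


def basis_expand(x, degree=3):
    """Centered polynomial basis for one predictor (assumed stand-in for a spline basis)."""
    cols = np.column_stack([x ** d for d in range(1, degree + 1)])
    return cols - cols.mean(axis=0)


def rss(design, y):
    """Residual sum of squares of an ordinary least squares fit of y on design."""
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    return float(resid @ resid)


def nonparametric_forward_selection(X, y, n_steps, degree=3):
    """Greedily add the predictor whose basis block gives the smallest RSS.

    n_steps is a fixed stopping rule chosen for illustration only.
    """
    n, p = X.shape
    y_centered = y - y.mean()  # centering removes the need for an intercept
    blocks = [basis_expand(X[:, j], degree) for j in range(p)]
    selected = []
    for _ in range(n_steps):
        best_j, best_rss = None, np.inf
        for j in range(p):
            if j in selected:
                continue
            # Refit with the already selected blocks plus candidate block j.
            design = np.hstack([blocks[k] for k in selected + [j]])
            candidate_rss = rss(design, y_centered)
            if candidate_rss < best_rss:
                best_j, best_rss = j, candidate_rss
        selected.append(best_j)
    return selected


if __name__ == "__main__":
    # Simulated example with more predictors than observations (p > n).
    rng = np.random.default_rng(0)
    n, p = 200, 300
    X = rng.uniform(-1.0, 1.0, size=(n, p))
    y = 3.0 * np.sin(2.0 * X[:, 0]) + 3.0 * X[:, 5] ** 2 + 0.5 * rng.normal(size=n)
    print(nonparametric_forward_selection(X, y, n_steps=2))
```

In this sketch every candidate is refit from scratch, which keeps the code short but is wasteful. A practical implementation would update the fit incrementally and would choose the number of steps with a model selection criterion.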
Keywords/Search Tags:nonparametric additive model, forward selection, main effects selection, interaction effects selection, sure screening property