Dimension reduction and variable selection in regression

Posted on: 2009-01-17
Degree: Ph.D
Type: Thesis
University: Hong Kong Baptist University (Hong Kong)
Candidate: Wen, Songqiao
Full Text: PDF
GTID: 2440390002992362
Subject: Mathematics
Abstract/Summary:
Consider the nonparametric regression of Y on X. As the dimension of X grows, standard techniques such as local smoothing break down quickly because the data points become sparse in any region of interest, a phenomenon often called the "curse of dimensionality". There are essentially two approaches in the literature to overcoming this difficulty. The first is largely concerned with function approximation, and the second is Sufficient Dimension Reduction. Additive models are an example of the former, and Sliced Inverse Regression (Li, 1991) is an example of the latter.

This thesis has two parts, which represent our efforts along these two approaches. The first part, comprising chapters 2 and 3, is concerned with dimension reduction for multivariate response data. The second part, chapter 4, is concerned with variable selection in additive regression models.

Consider the dimension reduction problem in which both the response and the predictor are vectors. Existing estimators for this problem take one of the following routes: (1) targeting the part of the dimension reduction space that is related to the conditional mean (or moments) of the response vector, (2) pooling the estimates of the marginal dimension reduction spaces, or (3) estimating the whole dimension reduction space directly by multivariate slicing. However, the first two approaches do not fully recover the dimension reduction space, and the third is hampered by the fact that the accuracy of estimators based on multivariate slicing drops sharply as the dimension of the response increases, which is again the "curse of dimensionality".
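
For reference, classical Sliced Inverse Regression with a univariate response, the method cited above, can be sketched roughly as follows. This is a minimal illustration and not code from the thesis: the function name `sir_directions`, the equal-count slicing scheme, and the default of 10 slices are all assumptions made for the example.

```python
import numpy as np

def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Sketch of Sliced Inverse Regression for a univariate response.

    Hypothetical helper for illustration; returns a (p, n_dirs) array
    whose unit-norm columns estimate dimension-reduction directions.
    """
    n, p = X.shape
    # Standardize the predictors: Z = (X - mean) Sigma^{-1/2}.
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ sigma_inv_sqrt
    # Slice the observations by sorted response into near-equal groups.
    slices = np.array_split(np.argsort(y), n_slices)
    # Weighted covariance matrix of the within-slice means of Z.
    M = np.zeros((p, p))
    for idx in slices:
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    # eigh returns ascending eigenvalues, so reverse to get the leading ones.
    _, v = np.linalg.eigh(M)
    B = sigma_inv_sqrt @ v[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

# Toy usage on simulated data (assumed single-index model).
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(2000, 5))
beta_demo = np.array([1.0, 1.0, 0.0, 0.0, 0.0]) / np.sqrt(2)
y_demo = np.exp(X_demo @ beta_demo) + 0.1 * rng.normal(size=2000)
b_hat = sir_directions(X_demo, y_demo)[:, 0]
print("absolute cosine with true direction:", abs(b_hat @ beta_demo))
```

The number of slices is a tuning parameter in this sketch, and extending the slicing step directly to a vector-valued response requires partitioning a multi-dimensional space, which is where the accuracy loss described above arises.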
In chapter 2, we propose a general method, called the Projective Resampling method in this thesis, that turns multivariate-response dimension reduction into univariate-response dimension reduction. It thereby avoids the curse of dimensionality caused by slicing the multivariate response, as required in the direct extensions of classical methods such as SIR and SAVE. The method is compared with the existing estimators by simulation and applied to a data set. Chapter 3 presents another effort on this topic. In this chapter, we propose a new method based on the covariance matrices of the predictors and the characteristic function of the response variable. The merit of this method is that it avoids the selection of tuning parameters, which is often an important issue because the performance of many methods, such as SAVE, relies heavily on the choice of the number of slices.

In chapter 4, we consider the problem of variable selection in additive regression models. The additive model plays an increasingly important role in modern statistical analysis. As in the linear regression model, selecting the significant independent variables to obtain a simple model is an important issue in additive modeling. In this chapter we adapt the group variable selection technique developed by Zhou and Zhu (2007) and combine it with the penalized regression spline technique proposed by Mammen and Van de Geer (1997) to perform variable selection and estimation for additive models. We show that the proposed method can simultaneously select the significant variables and estimate the nonparametric additive function components with the optimal convergence rate.
Simulations are carried out to investigate the performance of the new method.

Keywords: Central Mean Subspace, Central Subspace, Monte Carlo integration, Multivariate nonlinear regression, Sliced Average Variance Estimator, Sliced Inverse Regression; Variable selection, Penalized splines, Additive models, LASSO, Cross-Validation.
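
The idea of reducing a multivariate-response problem to univariate-response problems by projection, as described for chapter 2, can be illustrated with a rough sketch. The abstract does not give the algorithm, so everything here is an assumption made for illustration: the response vector is projected onto random unit directions, an SIR kernel matrix is computed for each scalar projection, and the kernels are averaged by Monte Carlo before the eigen-decomposition. The names `sir_kernel` and `projective_resampling_sir` and all default values are hypothetical, and the thesis's actual estimator may differ.

```python
import numpy as np

def sir_kernel(Z, y, n_slices):
    """SIR kernel matrix for standardized predictors Z and scalar response y."""
    n, p = Z.shape
    M = np.zeros((p, p))
    for idx in np.array_split(np.argsort(y), n_slices):
        m = Z[idx].mean(axis=0)
        M += (len(idx) / n) * np.outer(m, m)
    return M

def projective_resampling_sir(X, Y, n_dirs=1, n_proj=200, n_slices=10, seed=0):
    """Hypothetical sketch: average SIR kernels over random projections of Y."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    q = Y.shape[1]
    # Standardize the predictors once; this step does not depend on Y.
    evals, evecs = np.linalg.eigh(np.cov(X, rowvar=False))
    sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    Z = (X - X.mean(axis=0)) @ sigma_inv_sqrt
    M = np.zeros((p, p))
    for _ in range(n_proj):
        # Normalizing a Gaussian vector gives a uniform draw on the unit sphere.
        t = rng.normal(size=q)
        t /= np.linalg.norm(t)
        # The projected response Y @ t is scalar, so only 1-D slicing is needed.
        M += sir_kernel(Z, Y @ t, n_slices) / n_proj
    # Leading eigenvectors of the averaged kernel, mapped back to the X scale.
    _, v = np.linalg.eigh(M)
    B = sigma_inv_sqrt @ v[:, ::-1][:, :n_dirs]
    return B / np.linalg.norm(B, axis=0)

# Toy usage: a two-dimensional response depending on the first two predictors.
rng = np.random.default_rng(1)
X_demo = rng.normal(size=(2000, 5))
Y_demo = np.column_stack([
    X_demo[:, 0] + 0.2 * rng.normal(size=2000),
    np.exp(X_demo[:, 1]) + 0.2 * rng.normal(size=2000),
])
B_hat = projective_resampling_sir(X_demo, Y_demo, n_dirs=2)
print(np.round(B_hat, 2))
```

Each projection only requires slicing a scalar response, so the number of slices does not have to grow with the dimension of the response vector; the cost is the extra Monte Carlo loop over projection directions.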
Keywords/Search Tags: Regression, Variable selection, Dimension, Additive, Multivariate, Chapter