Influence Analysis Of Sufficient Dimension Reduction Methods Based On Distribution Weighting

Posted on: 2015-02-12 | Degree: Master | Type: Thesis
Country: China | Candidate: L Zhu | Full Text: PDF
GTID: 2207330422967879 | Subject: Statistics
Abstract/Summary:
Regression is a central statistical problem: inferring the distribution of the response given the predictors. When there are many predictors, fitting the relationship between the response and all of them may run into the “curse of dimensionality”. In many cases, the response depends on the predictors only through a few linear combinations; that is, given these combinations, the response is conditionally independent of the predictors. If these linear combinations can be found, the response can be regressed on them and the problem of high-dimensional predictors is avoided. The task of sufficient dimension reduction (SDR) is precisely to find these linear combinations without assuming any parametric model. Recently, as the dimension and volume of data produced in various fields keep increasing, dimension reduction has received a lot of attention, and SDR has become one of the focuses in statistics. Because SDR is a crucial stage of high-dimensional nonparametric regression and its result is the foundation of subsequent analysis, its robustness matters for modeling, and influence analysis of SDR is therefore necessary. Influence analysis is an important part of statistical diagnostics; it studies the sensitivity of inference results to the inputs of the model. Influence analysis in SDR assesses the robustness of dimension reduction (DR) methods by searching for the aspects of the model, such as data points, that influence the inference results much more strongly than others. In a sense, it helps us judge whether the result of dimension reduction is reliable. However, because the estimate produced by SDR is a vector space, existing methods of influence analysis cannot be applied to SDR directly.
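The conditional independence statement above can be written compactly. The notation below ($p$ predictors collected in $X$, response $Y$, and a $p \times d$ matrix $B$ with $d < p$) is assumed here for illustration and is not taken from the thesis.

```latex
Y \perp\!\!\!\perp X \mid B^{\top} X, \qquad B \in \mathbb{R}^{p \times d},\ d < p .
```

The target of SDR is the column space $\mathrm{span}(B)$ rather than $B$ itself, since any basis of the same space carries the same information about $Y$. This is why the estimate is a vector space.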
In this article, influence analysis is conducted for the distribution-weighted partial least squares estimate in the single-index model and for cumulative slicing estimation in the multiple-index model, and both case-deletion and local influence assessment methods are developed. With these methods, influential data points, especially those with special effects such as masking effects, can be detected. The main theoretical results obtained in this article are as follows:
1. For the influence analysis of the distribution-weighted partial least squares estimate and cumulative slicing estimation, a space displacement function is constructed to measure the difference between the perturbed and unperturbed DR space estimates, based on the canonical trace correlation coefficient proposed by Hooper (1959). This measure takes into account the covariance structure of the predictors and the statistical meaning of the DR space.
2. On the basis of this space displacement function, a concept named “quasi-curvature” is proposed for assessing the local influence of a perturbation on the DR space, together with a method for obtaining the perturbation direction that maximizes this quasi-curvature. This maximizing perturbation direction, after standardization, can be used as a statistic for influence assessment.
The above approaches can be viewed as a generalization of the methodologies based on the likelihood displacement function and normal curvature proposed by Cook (1986). A simulation study illustrates the effectiveness of our approaches in detecting influential data points.
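The case-deletion idea can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the displacement is taken as one minus the squared trace correlation between two subspaces (the thesis's exact function is not given in this abstract), the full-sample covariance is used as the predictor covariance, and an ordinary least squares slope stands in for the distribution-weighted partial least squares and cumulative slicing estimators, which are not reproduced here. The function names are hypothetical.

```python
import numpy as np


def trace_correlation_sq(A, B, Sigma):
    """Squared trace correlation between span(A) and span(B).

    Average of the squared canonical correlations between A^T X and B^T X
    when X has covariance Sigma. A and B must have the same number of columns.
    """
    A = A.reshape(len(A), -1)
    B = B.reshape(len(B), -1)
    SAA = A.T @ Sigma @ A
    SBB = B.T @ Sigma @ B
    SAB = A.T @ Sigma @ B
    # trace of SAA^{-1} SAB SBB^{-1} SBA equals the sum of squared canonical correlations
    M = np.linalg.solve(SAA, SAB) @ np.linalg.solve(SBB, SAB.T)
    return float(np.trace(M)) / A.shape[1]


def displacement(A, B, Sigma):
    """Assumed space displacement: 0 when the two spaces coincide, up to 1."""
    return 1.0 - trace_correlation_sq(A, B, Sigma)


def ols_direction(X, y):
    """Stand-in single-index estimator (OLS slope), NOT the thesis's estimator."""
    Xc = X - X.mean(axis=0)
    beta, *_ = np.linalg.lstsq(Xc, y - y.mean(), rcond=None)
    return beta.reshape(-1, 1)


def case_deletion_influence(X, y, estimator):
    """Displacement of the estimated space when each observation is deleted in turn."""
    n = X.shape[0]
    Sigma = np.cov(X, rowvar=False)  # full-sample covariance (an assumption)
    full = estimator(X, y)
    influence = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        influence[i] = displacement(full, estimator(X[keep], y[keep]), Sigma)
    return influence


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 4))
    y = X[:, 0] + 0.1 * rng.normal(size=100)
    # Plant one gross outlier at index 0 that drags the direction toward x2.
    X[0] = [0.0, 10.0, 0.0, 0.0]
    y[0] = 20.0
    infl = case_deletion_influence(X, y, ols_direction)
    print("most influential case:", int(np.argmax(infl)))
```

Because the displacement depends only on the spanned spaces, rescaling or changing the basis of either estimate leaves it unchanged, which is the property a measure on vector-space estimates needs. Deleting one case at a time can miss masked points, which is one reason the abstract also develops a local influence approach.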
Keywords/Search Tags:sufficient dimension reduction, distribution-weighted estimate, cumulative slicing estimation, case-deletion, local influence analysis