Local Influence Analysis of Principal Component Subsets in Principal Component Analysis

Posted on: 2017-03-15
Degree: Master
Type: Thesis
Country: China
Candidate: T Li
Full Text: PDF
GTID: 2209330485450921
Subject: Applied statistics
Abstract/Summary:
Principal components analysis (PCA) is a classical dimension reduction technique that is extensively applied in practical data analysis. The result of the reduction is the principal components subset, which contains the selected principal components and is generally used for follow-up data analysis. Hence, the robustness properties of the principal components subset are quite important. As a useful method of statistical diagnostics, influence analysis has already been extended to principal components analysis, and several methods have been proposed for local influence analysis of single principal components. However, local influence analysis for the whole principal components subset has not received enough attention. The existing influence analysis for the principal components subset mainly focuses on the influence function, where the influence of data points is assessed by the case-deletion method, which may suffer from the masking effect.

In this thesis, an approach to local influence analysis is proposed for sample principal components subsets. The proposed methodology is built on a joint perturbation scheme for the data points, which avoids the masking effect among influential cases. In principal components analysis, the distribution type of the population vector is not necessarily specified. This means that the likelihood displacement, a classical concept in local influence analysis, is not suitable for the principal components subset. In addition, the principal components subset is a set of random variables, which limits the use of many existing approaches to local influence analysis. The proposed approach is based on an appropriate measure of the discrepancy between the sample principal components subsets with and without perturbation. This measure is called the principal components subset displacement function and can be viewed as the counterpart of the likelihood displacement in principal components analysis.
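For reference, the likelihood displacement mentioned above is usually written in the local influence literature as shown below. The notation is an assumption for illustration and is not taken from the thesis: L denotes the log-likelihood of the unperturbed model, \hat{\theta} its maximum likelihood estimate, \omega the perturbation vector, and \hat{\theta}_{\omega} the estimate under the perturbed model. The influence graph is the surface traced out by the displacement over the perturbation space.

```latex
\[
  LD(\omega) = 2\left[ L(\hat{\theta}) - L(\hat{\theta}_{\omega}) \right],
  \qquad
  \alpha(\omega) = \begin{pmatrix} \omega \\ LD(\omega) \end{pmatrix}.
\]
```

Because this quantity requires a log-likelihood, it is unavailable when the distribution of the population vector is left unspecified, which is the situation described above.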
Some concepts from the likelihood displacement framework, including the influence graph, the perturbation direction, and the lifted line, are extended to the principal components subset displacement function. Within this framework, a concept called quasi-curvature is proposed, which is analogous to the normal curvature under the likelihood displacement. The perturbation direction that maximizes the quasi-curvature is used as the influence assessment statistic and is called the influential direction. For both types of principal components subsets, the one obtained from the sample covariance matrix and the one obtained from the sample correlation matrix, specific expressions for the quasi-curvatures of the lifted lines are derived and shown to be quadratic forms in the perturbation directions. The influential directions are then obtained from these expressions. A simulated data set is analyzed to illustrate the proposed methodology.
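The workflow described above can be sketched numerically. The following is a minimal illustration under stated assumptions, not the thesis's method: the abstract does not give the displacement function or the closed-form quasi-curvature, so the sketch uses a stand-in displacement (half the squared Frobenius distance between the projectors onto the spans of the top-k eigenvectors), a case-weight perturbation of all data points jointly, and finite differences in place of analytic expressions. All function names (`subset_projector`, `displacement`, `influential_direction`) are hypothetical.

```python
import numpy as np


def subset_projector(X, w, k, use_correlation=False):
    """Projector onto the span of the top-k eigenvectors of the case-weighted
    sample covariance (or correlation) matrix."""
    w = np.asarray(w, dtype=float)
    mean = w @ X / w.sum()
    Xc = X - mean
    S = (Xc * w[:, None]).T @ Xc / w.sum()
    if use_correlation:
        d = np.sqrt(np.diag(S))
        S = S / np.outer(d, d)
    _, vecs = np.linalg.eigh(S)  # eigenvalues are returned in ascending order
    V = vecs[:, -k:]             # eigenvectors of the k largest eigenvalues
    return V @ V.T


def displacement(X, w, k, use_correlation=False):
    """Stand-in displacement (an assumption): half the squared Frobenius
    distance between the perturbed and unperturbed subset projectors."""
    P0 = subset_projector(X, np.ones(X.shape[0]), k, use_correlation)
    Pw = subset_projector(X, w, k, use_correlation)
    return 0.5 * np.linalg.norm(Pw - P0, "fro") ** 2


def influential_direction(X, k, use_correlation=False, h=1e-4):
    """Unit direction l maximizing the quadratic form l' H l, where H is the
    Hessian of the stand-in displacement at the null perturbation w0 = 1."""
    n, p = X.shape
    w0 = np.ones(n)
    J = np.empty((p * p, n))
    for i in range(n):
        e = np.zeros(n)
        e[i] = h
        # Central difference of the projector with respect to the i-th weight.
        Pp = subset_projector(X, w0 + e, k, use_correlation)
        Pm = subset_projector(X, w0 - e, k, use_correlation)
        J[:, i] = (Pp - Pm).ravel() / (2 * h)
    # The displacement is zero at w0, so its Hessian there equals J' J.
    H = J.T @ J
    vals, vecs = np.linalg.eigh(H)
    return vecs[:, -1], vals[-1]


# Simulated data (hypothetical): 50 cases in 3 dimensions, case 0 made atypical.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3)) * np.array([2.0, 1.5, 0.5])
X[0] = [8.0, 8.0, 0.0]

direction, curvature = influential_direction(X, k=1)
print("case with largest |component|:", int(np.argmax(np.abs(direction))))
```

Cases with large absolute components in the returned direction are the ones flagged as jointly influential. Passing `use_correlation=True` gives the correlation-matrix variant of the same sketch.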
Keywords/Search Tags: Principal components analysis, Local influence analysis, Sample principal components subset, Influence assessment, Influence graph