
Supervised Dimension Reduction Of Multiple Compositional Data In Simplex Space

Posted on: 2022-01-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y M Gan
GTID: 2480306479951349
Subject: Applied Statistics
Abstract/Summary:
With the spread of big data, analysts face increasingly complex, high-dimensional data, and statistical analysis often runs into the curse of dimensionality. To overcome it while still making full use of the information in the data, sufficient dimension reduction has received growing attention. Compositional data convey proportional information and describe the structure of a whole; they are mainly used to forecast the development trend of a system. Such data arise in many fields, but their defining "constant-sum" constraint means that traditional statistical methods cannot be applied directly, and existing dimension reduction methods for compositional data are limited. In view of this, this paper proposes Compositional Data Directional Regression (CoDR), a new supervised dimension reduction method for compositional data.

First, we generalize the theoretical framework of the classical Directional Regression (DR) method and give the specific form of the central dimension reduction subspace for multiple compositional data. By introducing Aitchison geometry and the isometric logratio transformation, we then give a concrete algorithm for directional regression dimension reduction with multiple compositional data. The method reduces the dimension of multiple compositional data quickly in simplex space without losing the regression information about the dependent variable that is contained in the multiple compositional independent variables.

To evaluate the proposed CoDR method, we carry out extensive numerical simulations covering various distributions of compositional data, a variety of internal and external correlation structures and strengths among the compositional variables, and dimension reduction subspaces of different structural dimensions. The proposed method is also compared with the existing Compositional Data Sliced Inverse Regression (CoSIR) and Compositional Data Sliced Average Variance Estimation (CoSAVE) methods. The simulation results show that CoDR estimates the central dimension reduction subspace with smaller estimation error and combines the advantages of CoSIR and CoSAVE: it can estimate the effective dimension reduction directions in the case of even functions, and it also reduces dimension well when the sample size is small. In practice the structural dimension is usually unknown, so we propose a BIC-type information criterion that determines it adaptively from the data. Numerical simulation shows that the accuracy of this estimate increases with the sample size, which supports the consistency of the procedure numerically.

Finally, the proposed CoDR method is applied to data on urban residents' disposable income. We study the influence of regional GDP, total fixed asset investment, urban unit employment, and urban unit employment wages on urban residents' disposable income, and establish a regression model for prediction and analysis. Compositional covariates are obtained with the CoDR dimension reduction method, and a partial linear model is built on them. The prediction performance is compared with that of models based on CoSIR and CoSAVE. Results from 200 cross-validation experiments show that the model built on the directions found by CoDR has a smaller prediction error, which demonstrates the rationality and superiority of the proposed method in compositional data analysis.
Keywords/Search Tags:Compositional Data, Sufficient Dimension Reduction, Compositional Data Directional Regression Dimension Reduction Method, BIC Type Information Criterion, Urban Residents' Disposable Income
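
The abstract relies on the isometric logratio (ilr) transformation to move compositional data from the simplex into ordinary Euclidean coordinates, where methods such as DR can be applied. The following is a minimal sketch of that transformation only, not the thesis's CoDR algorithm. The function names (`closure`, `ilr_basis`, `ilr`, `ilr_inverse`, `aitchison_distance`) and the Helmert-type choice of orthonormal basis are assumptions made for illustration; the thesis may use a different basis.

```python
import numpy as np


def closure(x):
    """Rescale each row to sum to 1 (the constant-sum constraint)."""
    x = np.asarray(x, dtype=float)
    return x / x.sum(axis=-1, keepdims=True)


def ilr_basis(d):
    """Return a (d-1) x d matrix with orthonormal rows that each sum to zero.

    This Helmert-type contrast matrix is one of many valid ilr bases
    (an assumption of this sketch, not taken from the thesis).
    """
    v = np.zeros((d - 1, d))
    for i in range(1, d):
        v[i - 1, :i] = 1.0 / i
        v[i - 1, i] = -1.0
        v[i - 1] *= np.sqrt(i / (i + 1.0))
    return v


def clr(x):
    """Centered logratio: log parts minus their row-wise mean."""
    logx = np.log(closure(x))
    return logx - logx.mean(axis=-1, keepdims=True)


def ilr(x):
    """Map D-part compositions (rows) to D-1 real coordinates."""
    x = np.asarray(x, dtype=float)
    return clr(x) @ ilr_basis(x.shape[-1]).T


def ilr_inverse(z):
    """Map ilr coordinates back to compositions on the simplex."""
    z = np.asarray(z, dtype=float)
    d = z.shape[-1] + 1
    return closure(np.exp(z @ ilr_basis(d)))


def aitchison_distance(x, y):
    """Aitchison distance between two compositions, computed via clr."""
    return np.linalg.norm(clr(x) - clr(y))


if __name__ == "__main__":
    # Two illustrative 3-part compositions (made-up numbers).
    comps = np.array([[0.2, 0.3, 0.5],
                      [0.1, 0.6, 0.3]])
    coords = ilr(comps)
    print("ilr coordinates:\n", coords)
    print("round trip ok:", np.allclose(ilr_inverse(coords), comps))
    print("isometry ok:", np.isclose(
        aitchison_distance(comps[0], comps[1]),
        np.linalg.norm(coords[0] - coords[1])))
```

Because the ilr map is an isometry between the simplex with Aitchison geometry and Euclidean space, distances and inner products computed on the coordinates agree with their Aitchison counterparts. This is what allows a Euclidean dimension reduction method to be carried over to compositions.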