
Smoothed Generalized Empirical Likelihood And Elliptical Sliced Inverse Regression In High-dimensional Data

Posted on: 2020-09-08
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Zhang
GTID: 1529305894962269
Subject: Statistics
Abstract/Summary:
Over the past ten years, rapid advances in science and technology, especially in computing, have greatly improved the capacity to collect, store, and process data, and high-dimensional data now appear throughout real life. The need to analyze high-dimensional data stems from a variety of real-world problems, such as panel studies in economics, social and natural phenomena such as climate, genetic analysis, and communication engineering. As the ability to collect and store data continues to grow, the industrial, commercial, and government sectors face the task of analyzing high-dimensional data; they hope to understand the connections between different factors and to use the relevant information to improve forecasting accuracy and reduce decision-making risk. The dramatic increase in data dimension poses unprecedented challenges to traditional statistical and econometric models: existing estimation and testing methods are no longer applicable in the high-dimensional setting. How to extract useful information from large volumes of data in order to build models and make decisions has become an urgent problem across industries. In short, research in this field focuses mainly on two aspects: (1) estimation, inference, and testing for high-dimensional models, and (2) dimension reduction for high-dimensional data. This study contributes to both directions.

(1) We first propose a penalized generalized empirical likelihood approach based on smoothed moment functions (Smith, 1997, 2001; Anatolyev, 2005) for parameter estimation and variable selection in weakly dependent time series of growing (high) dimension. The dimensions of the parameters and of the moment restrictions are both allowed to grow with the sample size at moderate rates. The asymptotic properties of the smoothed generalized empirical likelihood estimator and of its penalized version are then obtained by suitably restricting the degree of data dependence. It is shown that the smoothed penalized generalized empirical likelihood estimator retains the oracle property despite data dependence and growing (high) dimensionality. Finally, we present simulation results and a real data analysis to illustrate the finite-sample performance and applicability of the proposed method.

(2) For dimension reduction, sliced inverse regression (SIR) is the most widely used sufficient dimension reduction method because of its simplicity, generality, and computational efficiency. However, when the distribution of the covariates deviates from the multivariate normal distribution, the estimation efficiency of SIR becomes rather low. In this dissertation, we propose a robust alternative to SIR, called elliptical sliced inverse regression, for analyzing high-dimensional, elliptically distributed data. Elliptically distributed data have wide applications, especially in finance and economics, where distributions are often heavy-tailed. To handle heavy-tailed, elliptically distributed covariates, we use the multivariate Kendall's tau matrix within the framework of a generalized eigenvector problem for sufficient dimension reduction. Methodologically, we present a practical algorithm for the method. Theoretically, we investigate the asymptotic behavior of the elliptical sliced inverse regression estimator in the high-dimensional setting. Extensive simulation results show that elliptical sliced inverse regression significantly improves estimation efficiency in heavy-tailed scenarios. The analysis of two real data sets also demonstrates the effectiveness of the method. Moreover, the idea of elliptical sliced inverse regression can be easily extended to most other sufficient dimension reduction methods and applied to non-elliptical heavy-tailed distributions.

In summary, this dissertation considers how to use the smoothed generalized empirical likelihood method to estimate the unknown parameters of estimating equation models for high-dimensional time series, and how to use the elliptical sliced inverse regression method for sufficient dimension reduction with non-Gaussian data. Because the non-parametric model framework considered here can be used to deal with many practical problems, the work reflects the role of statistics in solving practical problems and serving society, and it has both theoretical and practical value.

The innovations of this dissertation are mainly reflected in the following two aspects.

(1) Methodologically, the contribution is twofold. On the one hand, based on the estimating equation model, this dissertation proposes a smoothed penalized generalized empirical likelihood method for weakly dependent time series. This method performs well in numerical simulations and on real data. On the other hand, for non-Gaussian data, this dissertation proposes a robust sufficient dimension reduction method, elliptical sliced inverse regression. The algorithm is simple and practical, and it can achieve sufficient dimension reduction for a wide range of heavy-tailed data. In addition, the method extends readily to other settings.

(2) Theoretically, this dissertation first derives the consistency and asymptotic normality of the smoothed generalized empirical likelihood estimator and of the smoothed penalized generalized empirical likelihood estimator when the parameter dimension and the number of estimating equations increase with the sample size in the estimating equation model. The oracle property of the smoothed penalized generalized empirical likelihood estimator is also obtained. Second, a theoretical justification of the elliptical sliced inverse regression method is given, and the consistency and convergence rate of the proposed estimator of the central subspace are established in the high-dimensional setting.
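
The abstract names two ingredients for the dimension reduction part: the multivariate Kendall's tau matrix and a generalized eigenvector problem. As a rough illustration only, the Python sketch below computes the sample multivariate Kendall's tau matrix and, separately, solves the classical SIR generalized eigenproblem with the ordinary sample covariance. The abstract does not state how the dissertation combines the two, so this is not the author's algorithm. The function names (`multivariate_kendall_tau`, `sir_directions`), the equal-count slicing scheme, and the use of NumPy are all assumptions made for illustration.

```python
import numpy as np


def multivariate_kendall_tau(X):
    """Sample multivariate Kendall's tau matrix (illustrative helper).

    Averages (x_i - x_j)(x_i - x_j)^T / ||x_i - x_j||^2 over all pairs i < j.
    Assumes at least one pair of rows differs.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    K = np.zeros((p, p))
    n_pairs = 0
    for i in range(n - 1):
        diffs = X[i + 1:] - X[i]               # differences to all later rows
        norms2 = np.sum(diffs ** 2, axis=1)
        keep = norms2 > 0                      # skip exactly tied rows
        unit = diffs[keep] / np.sqrt(norms2[keep])[:, None]
        K += unit.T @ unit                     # sum of outer products of unit vectors
        n_pairs += int(keep.sum())
    return K / n_pairs


def sir_directions(X, y, n_slices=10, n_dirs=1):
    """Classical SIR (not the elliptical variant): solve M v = lambda * Sigma v.

    Sigma is the sample covariance of X, and M is the weighted covariance of
    the within-slice means of X after sorting by y. Assumes Sigma is
    positive definite.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    mean = X.mean(axis=0)
    Sigma = np.cov(X, rowvar=False)

    # Slice the observations by the order of y into roughly equal groups.
    order = np.argsort(y)
    M = np.zeros((p, p))
    for idx in np.array_split(order, n_slices):
        m = X[idx].mean(axis=0) - mean
        M += (len(idx) / n) * np.outer(m, m)

    # Reduce the generalized eigenproblem to a symmetric one by whitening.
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    w, V = np.linalg.eigh(Sigma_inv_sqrt @ M @ Sigma_inv_sqrt)
    top = V[:, np.argsort(w)[::-1][:n_dirs]]   # leading eigenvectors
    B = Sigma_inv_sqrt @ top                   # map back to the original scale
    return B / np.linalg.norm(B, axis=0)
```

Every summand in the Kendall's tau matrix is an outer product of a unit vector, so no single heavy-tailed observation can dominate the estimate. This is one reason such a matrix is attractive for the heavy-tailed elliptical covariates described above.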
Keywords/Search Tags: High-dimensional data analysis, Penalized empirical likelihood, Weak dependence, Sufficient dimension reduction, Elliptical sliced inverse regression