
Semiparametric Efficient Estimation In High Dimensional Regression Models

Posted on: 2024-07-16
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X Y Fu
Full Text: PDF
GTID: 1520307307994829
Subject: Financial statistics and risk management
Abstract/Summary:
High-dimensional regression models have been extensively studied in the literature. However, when the error distribution is unknown, how to incorporate efficiency into high-dimensional estimation remains an unsolved and challenging problem. Estimation based on penalized least squares loses efficiency when the error distribution is non-Gaussian, while maximum likelihood based estimation cannot be applied directly because the error density is unknown. Although quantile-based regression can reduce the efficiency loss, the asymptotic variance of the resulting estimators does not reach the semiparametric efficiency lower bound. How to obtain a sparse, efficient estimator under an unknown error distribution is therefore the main problem tackled in this thesis.

In this thesis, we study variable selection under an unknown error distribution for three types of high-dimensional regression models: the linear model, the partial linear model and the varying-coefficient partial linear model. Based on penalized estimating equations, we construct a series of sparse estimators for these three models and prove their semiparametric efficiency and oracle property. The main idea behind the semiparametric efficient estimator is to approximate the penalized likelihood estimating equation nonparametrically, using the error distribution of the model, so that the resulting estimator has the same asymptotic properties as the penalized likelihood estimator. We propose an improved EM algorithm to compute this kind of semiparametric efficient estimator. The nonparametric approximation of the error density is carried out by kernel density estimation, the nonparametric part of the partial linear model is estimated by local linear kernel estimation, and the HBIC method is used to select the tuning parameter of the penalty term.

In chapter 3, we propose a novel sparse semiparametric efficient estimation method for high-dimensional linear regression with unknown error density via penalized estimating equations. We prove that the new estimator is asymptotically as efficient as the oracle MLE in the ultra-high-dimensional setting with unknown error density, and is thus more efficient than the traditional penalized least squares estimator for non-Gaussian error densities. In addition, we demonstrate that several popular high-dimensional regression estimators are special cases of ours. Extensive simulation studies and an empirical analysis of a real data set are conducted to demonstrate the effectiveness of the proposed procedure and its superior performance compared with least squares based methods.

In chapter 4, we introduce a novel semiparametric efficient estimation procedure for high-dimensional varying-coefficient partial linear regression models. It overcomes the efficiency loss of the traditional least squares based estimation procedure under unknown error distributions, while enjoying several appealing theoretical properties. The new procedure provides a sparse estimator for the parametric component and achieves the same semiparametric efficiency as the oracle MLE, as if the error distribution were known. By combining penalized estimation with semiparametric efficiency theory for the ultra-high-dimensional varying-coefficient partial linear model, the procedure enjoys the oracle variable selection property and achieves an efficiency gain for non-Gaussian random errors, while maintaining the same efficiency as the least squares based estimator for Gaussian random errors. Extensive simulation studies and an empirical application are conducted to demonstrate the effectiveness of the proposed procedure.

In chapter 5, we propose a new estimation method that applies kernel density estimation to the penalized likelihood estimating equation, and we prove the semiparametric efficiency and oracle property of the proposed estimator. The corresponding algorithm is provided. We also conduct intensive simulation experiments to illustrate the superiority of the proposed estimator over the profile least squares estimator. A real data set is analyzed to show the effectiveness of the proposed estimator.

The innovation of this thesis is reflected in the following three aspects:

Firstly, this thesis proposes a series of semiparametric efficient sparse estimators for three types of regression models (linear regression, partial linear regression and varying-coefficient partial linear regression) with ultra-high-dimensional covariates. The proposed method has the oracle property, as if the error density were known.

Secondly, this thesis proposes a refined semiparametric efficient estimation method that is simple to compute. When the unknown error distribution is non-Gaussian, it has significant efficiency gains compared with penalized profile least squares estimation.

Lastly, this thesis establishes the asymptotic properties of the least squares estimator and the likelihood-based estimator for the partial linear regression model and the varying-coefficient partial linear regression model with ultra-high-dimensional covariates, which have received little in-depth study in the literature.
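As an illustration of the kernel density step mentioned in the abstract, the sketch below estimates the error score function -f'(e)/f(e) from a sample of residuals. A likelihood-type estimating equation for a regression coefficient would use this score in place of the unknown one. This is a minimal sketch under stated assumptions, not the thesis's algorithm: the function name `kernel_score`, the Gaussian kernel and the rule-of-thumb bandwidth are all illustrative choices, and the penalty term, the improved EM algorithm and HBIC tuning are not reproduced here.

```python
import numpy as np


def kernel_score(points, residuals, bandwidth=None):
    """Estimate the score -f'(x)/f(x) of the error density at `points`.

    Hypothetical helper: uses a Gaussian kernel density estimate built
    from `residuals`. The default bandwidth is a common rule of thumb
    (an assumption, not a choice taken from the thesis).
    """
    points = np.atleast_1d(np.asarray(points, dtype=float))
    residuals = np.asarray(residuals, dtype=float)
    n = residuals.size
    if bandwidth is None:
        bandwidth = 1.06 * residuals.std(ddof=1) * n ** (-1 / 5)

    # u[j, i] = (x_j - e_i) / h for every evaluation point and residual.
    u = (points[:, None] - residuals[None, :]) / bandwidth
    w = np.exp(-0.5 * u**2)  # unnormalized Gaussian kernel weights

    # For a Gaussian kernel, f_hat'(x) is proportional to -sum(u * w) / h,
    # so -f_hat'(x) / f_hat(x) = sum(u * w) / (h * sum(w)).
    numerator = (u * w).sum(axis=1)
    denominator = np.maximum(w.sum(axis=1), 1e-300)  # guard against 0/0
    return numerator / (bandwidth * denominator)


# Sanity check on simulated standard normal errors, whose true score is x.
rng = np.random.default_rng(0)
simulated_errors = rng.standard_normal(5000)
scores = kernel_score([-1.0, 0.5, 1.0], simulated_errors)
```

For Gaussian errors the estimated score is close to linear in the residual, which matches the statement that the efficient estimator does no worse than least squares in that case. For heavy-tailed or skewed errors the estimated score departs from linearity, and that departure is the source of the efficiency gain.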
Keywords/Search Tags:Semiparametric estimator, Estimating equations, High dimension estimation, Asymptotically efficient, Partial linear model, Varying-coefficient partial linear model