Robust Estimation And Variable Selection For Some Semiparametric Models

Posted on: 2016-09-07
Degree: Doctor
Type: Dissertation
Country: China
Candidate: J Lv
Full Text: PDF
GTID: 1220330503452372
Subject: Statistics
Abstract/Summary:
Semiparametric regression models combine the flexibility of nonparametric regression models with the interpretability of parametric regression models. They have therefore received widespread attention and are widely applied in economics, biology, medicine and other fields. This thesis focuses on three kinds of semiparametric models: varying coefficient models, partially linear additive models and varying index coefficient models. Most existing estimation procedures are built on least squares or the likelihood function, and it is well known that they are not robust: least squares and likelihood-based methods are sensitive to outliers and heavy-tailed error distributions, so their estimation efficiency may be greatly reduced. Even worse, the least squares estimator is not consistent when the second-order moment of the random error does not exist. This motivates us to look for robust approaches from different angles. Variable selection is an equally basic and important task. A good statistical model should contain only those covariates that are truly related to the response variable; only then do we obtain a simpler model with better prediction accuracy. The purpose of this thesis is therefore to propose a series of robust estimation and variable selection methods for varying coefficient models, partially linear additive models and varying index coefficient models. Specifically, the research consists of the following three parts.

The first part studies robust estimation and variable selection for varying coefficient models. Chapter 2 proposes a robust and efficient unified variable selection approach for varying coefficient models that combines B-spline basis function approximation, double SCAD penalty functions and rank regression.
Without any prior information, the proposed method can not only select important variables but also distinguish variables with varying coefficient effects from those with constant effects. Under suitable conditions, we prove that the proposed method is consistent in both variable selection and the separation of varying and constant coefficients, and that the nonzero parametric estimators enjoy the oracle property. Simulated examples and a real data analysis confirm the robustness and efficiency of the proposed approach. The varying coefficient models in Chapter 2 cannot handle discrete response variables. Chapter 3 therefore studies the more flexible generalized varying coefficient partially linear models, which allow for non-Gaussian data and nonlinear link functions. Within this framework, we construct a new robust estimating equation using an exponential score function and a weight function. The new estimators are resistant to outliers in both the response and the covariates, and they remain efficient when the tuning parameter is chosen appropriately. Furthermore, we propose a robust variable selection procedure for the parametric part based on the smooth-threshold estimating equations proposed by Ueki (2009). Under reasonable conditions, the proposed estimator has the oracle property. In addition, following the idea of the Newton-Raphson method, we give an iterative algorithm for solving the new robust estimating equations numerically, and we discuss how to choose the tuning parameters involved in the estimating equations. Simulated examples and a real data application verify the advantages of the proposed method.

In the second part, we study robust estimation and variable selection for partially linear additive models.
Chapter 4 proposes a robust variable selection method based on B-spline basis function approximation, double SCAD penalty functions and modal regression. Under suitable conditions, the proposed method is consistent in both parametric and nonparametric variable selection, the nonparametric estimators achieve the optimal convergence rate, and the nonzero parametric estimators have the oracle property. We also give an algorithm for computing the penalized estimators based on the EM algorithm and local quadratic approximation. Simulated examples and a real data analysis show that the proposed estimator is robust and compares favorably with some existing methods. Chapter 5 studies partially linear additive models with longitudinal data. Within the framework of quantile regression, we construct new estimating functions based on a working correlation matrix. The main advantage of the new method is that it both incorporates the within-subject correlation and is robust. However, the objective function is non-convex, discontinuous and non-differentiable. To overcome these difficulties, we apply the induced smoothing method proposed by Brown and Wang (2005) to obtain numerical solutions of the proposed estimating equations. In addition, we construct robust smooth-threshold generalized estimating equations to carry out variable selection. Under suitable conditions, we show that the proposed estimator has the oracle property. Simulated examples and a real data application confirm the superiority of the proposed method.

The third part studies robust estimation for varying index coefficient models. The varying index coefficient model is a very flexible model that includes many common semiparametric models as special cases, such as the varying coefficient model, the varying coefficient partially linear model, the additive model and the partially linear additive model.
Chapter 6 uses B-spline basis function approximation and modal regression to construct a new robust estimation procedure for varying index coefficient models, so this chapter can be seen as an extension of Chapter 4. In theory, we establish the large-sample properties of the proposed estimators, including consistency and asymptotic normality. We also give an estimation algorithm that combines the EM algorithm with the Fisher scoring method. Simulation studies and a real data analysis show that the proposed estimator performs well.
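
As a rough illustration of the EM-type modal regression idea mentioned for Chapters 4 and 6, the following minimal sketch fits a plain linear model by modal regression with a Gaussian kernel. It is not the thesis's algorithm: it leaves out the B-spline approximation, the SCAD penalties and the index structure. The function name `modal_linear_regression`, the fixed bandwidth of 0.5 and the simulated data are all assumptions made for this example.

```python
import numpy as np

def modal_linear_regression(X, y, bandwidth, n_iter=200, tol=1e-8):
    """EM-type iteration for linear modal regression (Gaussian kernel).

    Illustrative sketch only: hypothetical helper, fixed bandwidth,
    no penalty and no spline basis.
    """
    # Start from the ordinary least squares fit.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        resid = y - X @ beta
        # E-step: kernel weights, large for residuals near zero.
        w = np.exp(-0.5 * (resid / bandwidth) ** 2)
        w /= w.sum()
        # M-step: weighted least squares with the current weights.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta

# Simulated data (assumed for illustration): a line plus small noise,
# with 10% of the responses shifted upward to act as gross outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, size=n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:20] += 10.0

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_modal = modal_linear_regression(X, y, bandwidth=0.5)
print("OLS:  ", beta_ols)
print("modal:", beta_modal)
```

In this toy setting the outliers receive kernel weights close to zero after the first iteration, so the modal fit stays near the true coefficients while the least squares fit is pulled toward the contaminated points. In practice the bandwidth plays the role of a tuning parameter and would be chosen from the data rather than fixed.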
Keywords/Search Tags: Semiparametric regression models, Robust estimation, B-spline, Variable selection, Oracle property