Model Averaging For Two Classes Of Semiparametric Regression Models

Posted on: 2022-05-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: G Z Hu
Full Text: PDF
GTID: 1487306764495434
Subject: Vocational Education
Abstract/Summary:
Model averaging has been a hot topic in statistical research in recent years. It combines estimators or predictions from different models using weights; with appropriately chosen weights, the accuracy of estimation or prediction can be effectively improved. Model averaging can be divided into Bayesian model averaging (BMA) and frequentist model averaging (FMA). Because BMA requires specifying a prior probability for each model, more and more scholars have focused on FMA. Research on FMA concentrates on two aspects: one is the selection of optimal weights in model averaging estimators; the other is the limiting distribution of model averaging estimators, on which a theory of statistical inference can then be built. Semiparametric regression models were developed on the basis of linear models and nonparametric models, and in recent years they have received extensive attention from statisticians. This dissertation explores model averaging for two types of semiparametric models, namely partially linear varying coefficient models and partially linear models. It covers both the asymptotic optimality and the asymptotic distribution of the model averaging estimators. The specific research contents comprise the following four parts:

(1) For the semiparametric partially linear varying coefficient model, the general series method is used to estimate the unknown parameters in the model, and the model averaging estimator for the mean of the response variable is derived. A "leave-one-out" cross-validation (CV) method is then used to select the weights in the model averaging estimator, which yields the jackknife model averaging (JMA) estimator. To facilitate computation and theoretical analysis, a shortcut formula for calculating the "leave-one-out" estimator of the mean of the response variable is constructed. Under some regularity conditions, the asymptotic optimality of the model averaging estimator is proved. Simulation studies show that the proposed JMA estimator outperforms traditional model selection and model averaging estimators, and they also verify the computational efficiency of the shortcut formula. Finally, the proposed method is used to analyze the CD4 data set to illustrate its practicality.

(2) For the semiparametric partially linear varying coefficient model with longitudinal data, the parameters in the full model and in each candidate submodel are estimated by basis function expansion and the generalized estimating equation (GEE). From the relationship between the linear parameter estimators under the submodel and under the full model, the asymptotic properties of the linear parameter estimators under each candidate submodel are established, and the basic formula of the focused information criterion (FIC) is then derived. Moreover, a model averaging estimator named smoothed FIC (SFIC) is established, its limiting distribution is developed, and a confidence interval for the focus parameter whose coverage probability approaches the nominal level is constructed. The simulation study shows that, whether or not the working correlation structure is misspecified, the proposed model averaging procedure based on SFIC is superior to traditional model selection methods. Finally, the method is applied to analyze the CD4 data set.

(3) For the partially linear varying coefficient quantile regression model, splines are used to approximate the varying coefficient functions, and the estimator of the linear parameter under each candidate submodel is obtained by minimizing the quantile loss function. The model averaging estimator of the focus parameter is then constructed, its asymptotic distribution is established, and a confidence interval whose actual coverage probability tends to the nominal level is constructed. Simulation studies demonstrate that the model averaging estimator is better than traditional model selection estimators.

(4) For the semiparametric partially linear model with missing covariates, the inverse probability weighted method is employed to obtain the kernel estimator of the unknown parameter under each candidate submodel, and the asymptotic distribution of the estimator is developed. Based on the asymptotic distribution of the focus parameter estimator, a model selection method built on the FIC is constructed. Furthermore, the model averaging estimator and its asymptotic distribution are derived, and a suitable confidence interval is constructed for the focus parameter. A simulation study shows that the proposed model averaging estimator is better than the model selection estimators. Finally, the method studied in this dissertation is applied to analyze the ragweed pollen level data.
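
The jackknife model averaging idea in part (1) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the simulated data, the polynomial series in `u` used to approximate the coefficient function, the helper names (`loo_residuals`, `loo_residuals_brute`, `jma_weights`), and the Frank-Wolfe solver for the weights are all assumptions made for illustration. The sketch relies only on the standard least-squares identity that the leave-one-out residual equals the ordinary residual divided by one minus the leverage, which is one way a "shortcut" of the kind described above can avoid refitting the model n times.

```python
import numpy as np

def loo_residuals(X, y):
    """Leave-one-out residuals of a least-squares fit, without refitting.

    Uses the identity e_i^(-i) = e_i / (1 - h_ii), where h_ii is the
    i-th diagonal entry of the hat matrix X (X'X)^{-1} X'.
    """
    Q, _ = np.linalg.qr(X)            # orthonormal basis of the column space of X
    h = np.sum(Q ** 2, axis=1)        # leverages h_ii
    resid = y - Q @ (Q.T @ y)         # ordinary residuals
    return resid / (1.0 - h)

def loo_residuals_brute(X, y):
    """Same quantity computed by actually deleting each observation (slow)."""
    n = len(y)
    out = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i
        beta, *_ = np.linalg.lstsq(X[keep], y[keep], rcond=None)
        out[i] = y[i] - X[i] @ beta
    return out

def jma_weights(E, n_iter=500):
    """Minimise the CV criterion ||E w||^2 over the weight simplex.

    E has one column of leave-one-out residuals per candidate model.
    Solver: Frank-Wolfe with exact line search (an illustrative choice).
    """
    M = E.shape[1]
    w = np.zeros(M)
    w[np.argmin(np.sum(E ** 2, axis=0))] = 1.0   # start at the best single model
    for _ in range(n_iter):
        r = E @ w
        grad = 2.0 * E.T @ r
        s = np.zeros(M)
        s[np.argmin(grad)] = 1.0                 # best vertex of the simplex
        d = E @ (s - w)
        denom = d @ d
        if denom < 1e-15:
            break
        step = np.clip(-(r @ d) / denom, 0.0, 1.0)  # exact minimiser along the segment
        if step == 0.0:
            break
        w = w + step * (s - w)
    return w

# Simulated data (hypothetical): y = x * beta + a(u) * z + noise.
rng = np.random.default_rng(0)
n = 80
u, x, z = rng.uniform(size=n), rng.normal(size=n), rng.normal(size=n)
y = 1.0 * x + np.sin(2 * np.pi * u) * z + 0.5 * rng.normal(size=n)

# Candidate model k approximates a(u) by a polynomial of degree k in u.
designs = [
    np.column_stack([x] + [z * u ** j for j in range(k + 1)])
    for k in range(1, 5)
]
E = np.column_stack([loo_residuals(X, y) for X in designs])
w = jma_weights(E)
cv_single = np.sum(E ** 2, axis=0)   # CV score of each candidate model alone
cv_average = np.sum((E @ w) ** 2)    # CV score of the weighted average
```

Because the solver starts at the best single candidate and each exact line-search step cannot increase the criterion, `cv_average` is never larger than the smallest entry of `cv_single`. The brute-force function is included only so the shortcut can be checked against explicit refitting.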
Keywords/Search Tags: Semiparametric regression models, Longitudinal data, Missing data, Model selection, Model averaging