
Estimation And Variable Selection Of Regression Models With Missing Data

Posted on: 2019-12-13
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X W Ding
Full Text: PDF
GTID: 1360330548973912
Subject: Applied Statistics
Abstract/Summary:
Statistical analysis of missing data has long been an interesting and important research topic in statistics. When the missing data mechanism is missing at random, many methods are available for analyzing such data. In real applications, however, nonignorable missing data are often encountered in many research areas, and methods developed for ignorable missing data cannot be applied directly to nonignorable missing data. Meanwhile, high dimensional missing data are frequently encountered in areas such as economics, biomedical science and social science. It is therefore worth studying this type of data, since no existing work addresses it.

When the missing data are subject to nonignorable missingness, we develop a penalized estimation procedure for the ultrahigh dimensional quantile regression model, study an adjusted empirical likelihood estimation procedure for the distribution and quantiles of the response, and investigate the estimation problem for nonlinear regression models based on the empirical likelihood method. When some covariates are missing at random, the thesis considers a regularized estimation procedure for the multiplicative regression model. Specifically, the main work of this thesis is summarized as follows.

1. The thesis concerns regularized quantile regression for ultrahigh dimensional data with responses not missing at random. We propose an inverse probability weighted and penalized objective function for regularized estimation using the nonconvex penalties SCAD and MCP. We develop a two-step procedure to estimate the propensity score model with sparsity: the first step correctly identifies the important features in the propensity score using Pearson chi-square type test statistics; the second step estimates the unknown parameter in the reduced propensity score model using the adjusted empirical likelihood method. Under some regularity conditions, we establish the oracle properties of the proposed regularized estimators. Simulation studies and a real data analysis illustrate the effectiveness and feasibility of the proposed methods.

2. The thesis considers the estimation of distribution functions and quantiles with nonignorable missing response data. Three approaches are developed to estimate distribution functions and quantiles: the Horvitz-Thompson-type method, the regression imputation method and the augmented inverse probability weighted approach. The propensity score is specified by a semiparametric exponential tilting model. To estimate the tilting parameter in the propensity score, we propose an adjusted empirical likelihood method to deal with the over-identified system. Under some regularity conditions, we investigate the asymptotic properties of the three proposed estimators for distribution functions and quantiles, and find that these estimators have the same asymptotic variance. The jackknife method is employed to consistently estimate the asymptotic variances. Simulation studies and a real data analysis illustrate the effectiveness and feasibility of the proposed methods.

3. The thesis studies the estimation procedure for nonlinear regression models based on empirical likelihood under nonignorable missing response. Assuming the response model is purely parametric, we propose a semiparametric likelihood method to obtain a consistent estimator of the parameters in the propensity score model. To reduce the dimension of the response model, we propose a penalized semiparametric likelihood method that performs parameter estimation and variable selection for the propensity score model simultaneously. By utilizing an appropriate penalty function, we show that the penalized semiparametric likelihood estimator has the oracle property. Two types of estimating equations under nonignorable missingness, namely inverse probability weighted and augmented inverse probability weighted estimating equations, are defined to construct empirical likelihood functions. Our theoretical results reveal that the empirical log-likelihood ratio functions asymptotically follow a standard chi-squared distribution when the parameters of the response model are known in advance. When the parameters of the response model are estimated consistently via the semiparametric likelihood or penalized semiparametric likelihood method, the empirical log-likelihood ratio functions asymptotically follow a weighted chi-squared distribution. The asymptotic normality of the estimators of the regression parameters is also established. Simulation studies and a real data analysis illustrate the effectiveness and feasibility of the proposed methods.

4. The thesis studies the variable selection procedure for the multiplicative regression model with covariates missing at random. A weighted, relative error-based objective function using inverse probability weighting is proposed to remove the potential bias caused by missing data. A penalized and weighted objective function using the adaptive lasso penalty is proposed for variable selection in the model. Assuming the missing data problem remains low-dimensional, the resultant estimator achieves the oracle property whether the number of variables in the regression model is fixed or diverging. An effective and fast algorithm is proposed for computing the regularization paths. The performance of the method is evaluated using Monte Carlo simulations.

This thesis studies estimation and variable selection procedures for missing data models, extends statistical analysis methods for full data to the missing data case, provides theoretical and methodological support for missing data analysis, and has broad application prospects.
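
The inverse probability weighting idea that recurs throughout the abstract can be illustrated with a minimal Python sketch. This is not the thesis's procedure: it assumes the propensity score is known exactly, whereas the thesis estimates it (for example through an exponential tilting model and adjusted empirical likelihood). The function names `ipw_cdf`, `ipw_quantile` and `demo`, the logistic propensity, and the simulation settings are all hypothetical choices made for illustration. The weights are normalized to sum to one, a common variant of the Horvitz-Thompson-type estimator.

```python
import numpy as np


def ipw_cdf(y, observed, propensity, t):
    """Inverse-probability-weighted estimate of P(Y <= t).

    Only observed responses are used; each is weighted by 1 / propensity.
    Weights are normalized so the estimate stays in [0, 1].
    """
    mask = observed == 1
    w = 1.0 / propensity[mask]
    return float(np.sum(w * (y[mask] <= t)) / np.sum(w))


def ipw_quantile(y, observed, propensity, tau):
    """Smallest observed y whose weighted CDF reaches tau."""
    mask = observed == 1
    ys = y[mask]
    w = 1.0 / propensity[mask]
    order = np.argsort(ys)
    ys, w = ys[order], w[order]
    cdf = np.cumsum(w) / np.sum(w)
    # First index where the weighted CDF is at least tau.
    idx = int(np.searchsorted(cdf, tau, side="left"))
    return float(ys[min(idx, len(ys) - 1)])


def demo(seed=0, n=20000):
    """Simulate responses whose chance of being observed depends on Y itself.

    Assumption for illustration: the true propensity is known and logistic in y.
    Returns (naive median of observed responses, weighted median).
    """
    rng = np.random.default_rng(seed)
    y = rng.standard_normal(n)                      # true median is 0
    propensity = 1.0 / (1.0 + np.exp(-(0.5 + y)))   # larger y is observed more often
    observed = (rng.random(n) < propensity).astype(int)
    naive = float(np.median(y[observed == 1]))
    weighted = ipw_quantile(y, observed, propensity, 0.5)
    return naive, weighted
```

In this simulation the true median is 0. The naive median of the observed responses is pulled upward because larger responses are more likely to be observed, while the weighted estimate should land close to 0.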
Keywords/Search Tags:Variable selection, Missing not at random, Multiplicative regression model, Nonlinear regression model, Estimation of quantiles, Ultrahigh dimensional quantile regression model, Empirical likelihood, Missing at random