Study On Statistical Test, Model Selection And Its Relative Problems

Posted on: 2016-06-10
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X C Xia
Full Text: PDF
GTID: 1227330503452328
Subject: Statistics
Abstract/Summary:
Hypothesis testing for regression parameters and model selection are two important topics in statistics. Both can reduce the risk of model misspecification, simplify the model and improve its predictive ability. Research on these two topics is therefore necessary and has considerable potential value for both theoretical development and practical application. This motivates the present thesis to investigate both, with emphasis on model selection, mainly in the setting of semiparametric models. Five concrete problems are considered.

Chapter 2 considers testing hypotheses on the parameters of the linear part of a partially linear varying coefficient model when responses are missing and covariates are subject to errors-in-variables (EV). First, estimators of the parameter and of the nonparametric coefficients are derived under a linear constraint. Then two approaches for testing the constraint are proposed, one based on the Lagrange multiplier test and the other on a corrected residual sum of squares test. Under the null hypothesis, the two tests are proved to be equivalent in the sense that the two statistics not only share the same chi-squared limiting distribution but are also exactly equal in magnitude. Finally, a simulation study and a real data analysis are carried out to assess the validity of the two testing approaches.

Chapter 3 stays with the model of Chapter 2 and considers variable selection with the adaptive Lasso and SCAD penalties. Under some mild conditions, consistency and the oracle property are proved for the proposed adaptive Lasso and SCAD estimators. Implementation issues are also discussed, including two algorithms for computing the solution, a standard error formula and criteria for choosing the tuning parameter. Finally, a simulation study shows empirically that the proposed variable selection procedures are workable and effective.

Chapter 4 considers variable selection with the SCAD penalty in a partially time-varying coefficient model, in the setting where the covariate observations are not independent and identically distributed and some of the variables are measured with error. The estimator of the parameter in the linear part is consistent and has the oracle property under some technical conditions when the sequence of latent variables is ?-mixing. A test for the parameter based on penalized least squares is also discussed. It is proved that the penalized least squares test statistic no longer follows the usual chi-squared limiting distribution but a weighted chi-squared one, which implies that the Wilks phenomenon does not hold in this model. Simulation results and an empirical analysis are reported to support the proposed method.

Chapter 5 considers variable selection for the multiplicative model under the least absolute relative error (LARE) loss when the number of covariates diverges. The parameter estimator is obtained by a general weighted $L_1$ penalization. It is proved that the proposed approach can handle a dimension as high as $p_n = o(n^{1/2})$ for consistency and $p_n = o(n^{1/3})$ for the oracle property of the estimator. Because the solution obtained with a nonsmooth optimization algorithm is not sparse, an alternative variable selection procedure is developed for practical use. It is shown that this procedure not only has the same asymptotic properties but can also be computed more quickly. Numerical studies show that this practical procedure performs much better than the LAD method.

Chapter 6 considers ultrahigh-dimensional variable screening in the framework of the generalized varying coefficient model, for the practical situation in which the response is Bernoulli and the number of covariates is much larger than the sample size. Two screening procedures are developed, based respectively on marginal maximum likelihood estimation and on the marginal likelihood ratio statistic. Screening consistency and ranking consistency are established under some technical conditions, which extends the methods and results of Fan and Song (2010) and Fan et al. (2014) to some extent. Refined algorithms such as iterative screening and greedy screening are also presented. Finally, numerical studies indicate that the proposed nonparametric screening approaches work well and outperform screening approaches based on parametric models.
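
For readers unfamiliar with two of the quantities named in the abstract, the following is a minimal Python sketch of the SCAD penalty in its standard textbook form and of the unpenalized LARE criterion for a multiplicative model. It is an illustration only, not the thesis's estimators: the function names `scad_penalty` and `lare_loss`, the default `a = 3.7` (a conventional choice) and the model form `y = exp(X @ beta) * error` are assumptions made here, and the weighted penalization, measurement-error corrections and missing-data handling described above are not included.

```python
import numpy as np


def scad_penalty(theta, lam, a=3.7):
    """Standard SCAD penalty, evaluated elementwise (hypothetical helper).

    Linear near zero, quadratic in the middle, constant for large |theta|.
    """
    t = np.abs(np.asarray(theta, dtype=float))
    linear = lam * t                                        # |theta| <= lam
    quadratic = (2 * a * lam * t - t**2 - lam**2) / (2 * (a - 1))  # lam < |theta| <= a*lam
    constant = lam**2 * (a + 1) / 2                         # |theta| > a*lam
    return np.where(t <= lam, linear,
                    np.where(t <= a * lam, quadratic, constant))


def lare_loss(beta, X, y):
    """Unpenalized LARE criterion, assuming y = exp(X @ beta) * error with y > 0.

    Sums the error relative to the response and the error relative to the fit.
    """
    pred = np.exp(X @ beta)
    resid = np.abs(y - pred)
    return float(np.sum(resid / y + resid / pred))
```

The SCAD penalty is flat beyond `a * lam`, so large coefficients are not shrunk further; this is the usual explanation for why it can achieve the oracle property where the plain Lasso penalty may not.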
Keywords/Search Tags: Partially linear varying coefficient model, Measurement error, Hypothesis test, Variable selection, High-dimensional screening