Empirical Likelihood And Composite Inference Methods For Some Complicated Data Models

Posted on: 2014-08-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: X S Zhou
GTID: 1260330425962094
Subject: Probability theory and mathematical statistics
Abstract/Summary:
The empirical likelihood method, a nonparametric method, has received increasing attention since it was first proposed by Owen (1988). It has been widely used to construct confidence regions for parameters of interest and for smooth functions. The literature documents many advantages of empirical likelihood over the normal approximation method. For example, the shape and orientation of empirical likelihood based confidence regions are determined entirely by the data, and these regions are range preserving and transformation respecting. Empirical likelihood has thus become a very useful tool for statistical inference, and many authors have applied it to linear, nonparametric and semiparametric regression models. In many application fields, such as industrial and agricultural production, social surveys, economics, biomedical sciences and epidemiology, complicated data such as measurement error data, missing data and censored data are often encountered. How to handle such data and derive efficient inferences has become one of the hot issues in modern statistical analysis. One important aim of this thesis is to employ empirical likelihood to investigate two classes of semiparametric models with complicated data; this work further broadens the application areas of empirical likelihood.

With the development of applied sciences, semiparametric regression models have been well studied and widely used for their flexibility and interpretability. Among semiparametric models, the varying-coefficient partially linear model and the additive partially linear model are two commonly used classes, because they effectively avoid the "curse of dimensionality" of nonparametric models while retaining the explanatory power of the linear regression model.
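To make the empirical likelihood ratio concrete, the following is a minimal sketch for the simplest case, a scalar mean, in the spirit of Owen (1988). It is not the thesis's semiparametric procedure; the function name `el_log_ratio` and the bisection solver are assumptions made for illustration.

```python
import numpy as np

def el_log_ratio(x, mu, iters=200):
    """Return -2 log R(mu), the empirical likelihood ratio statistic for a mean."""
    z = np.asarray(x, dtype=float) - mu
    if z.min() >= 0 or z.max() <= 0:
        return np.inf  # mu lies outside the convex hull of the data
    # The Lagrange multiplier lam solves sum(z / (1 + lam * z)) = 0.
    # That sum is strictly decreasing in lam on the interval (lo, hi).
    lo, hi = -1.0 / z.max(), -1.0 / z.min()
    for _ in range(iters):
        lam = 0.5 * (lo + hi)
        if np.sum(z / (1.0 + lam * z)) > 0:
            lo = lam
        else:
            hi = lam
    lam = 0.5 * (lo + hi)
    return 2.0 * np.sum(np.log1p(lam * z))

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
stat_at_mean = el_log_ratio(x, x.mean())   # essentially zero
stat_off_mean = el_log_ratio(x, 3.5)       # strictly positive
```

Comparing the statistic with a chi-square critical value with one degree of freedom (about 3.84 at the 95% level) gives a confidence interval for the mean whose shape is driven by the data rather than by a symmetric normal approximation.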
In this thesis, we therefore employ the empirical likelihood method to make inferences about the parametric and nonparametric components of the longitudinal additive partially linear errors-in-variables model in Chapter 2, and about the longitudinal semiparametric varying-coefficient partially linear errors-in-variables model with missing responses in Chapter 3.

The longitudinal semiparametric additive partially linear errors-in-variables model has the following form:

$$Y_{ij} = X_{ij}^T\beta + f_1(Z_{ij1}) + \cdots + f_D(Z_{ijD}) + \varepsilon_{ij}, \qquad W_{ij} = X_{ij} + U_{ij},$$

where $Y_{ij}$ is the response variable for the $j$th measurement of the $i$th subject, $X_{ij} \in \mathbb{R}^p$ and $Z_{ij} = (Z_{ij1},\ldots,Z_{ijD})^T \in \mathbb{R}^D$ are the covariates for the $j$th measurement of the $i$th subject, $f_1,\ldots,f_D$ are unspecified smooth functions, $\beta = (\beta_1,\ldots,\beta_p)^T$ is a $p$-dimensional vector of unknown parameters, and $\varepsilon_{ij}$ is the random error satisfying $E(\varepsilon_{ij} \mid X_{ij}, Z_{ij}) = 0$. $U_{ij}$ is the measurement error in the surrogate $W_{ij}$, with $E(U_{ij}) = 0$ and $\mathrm{Cov}(U_{ij}) = \Sigma_{uu}$, and $U_{ij}$ is independent of $(X_{ij}, Z_{ij}, Y_{ij})$. For simplicity, we consider the case $D = 2$. To ensure identifiability of the nonparametric functions, we assume that $E\{f_1(Z_1)\} = E\{f_2(Z_2)\} = 0$, and we also assume that $X$ and $Y$ are centered.

Using correction for attenuation, we obtain an attenuation-corrected auxiliary vector as an estimating function for the unknown parameter and define the corresponding attenuation-corrected block empirical likelihood ratio function. We prove that the proposed statistic for the unknown parameter has a standard chi-square limiting distribution, so it can be conveniently used to derive confidence regions. Simulation studies comparing coverage probabilities and average lengths of the confidence intervals indicate that the proposed method outperforms the profile-based least-squares method studied by Liang, Thurston, Ruppert, Apanasovich and Hauser (2008).
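The correction for attenuation mentioned above can be illustrated in a much simpler setting than the thesis considers. The sketch below assumes a scalar linear model with no nonparametric part, centered variables and a known error variance `sigma_uu`; all variable names and the simulated data are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, beta, sigma_uu = 100_000, 2.0, 0.5

x = rng.normal(size=n)                                # true covariate (unobserved)
w = x + rng.normal(scale=np.sqrt(sigma_uu), size=n)   # surrogate W = X + U
y = beta * x + rng.normal(size=n)                     # response

# Regressing y on the surrogate w shrinks the slope toward zero.
naive = (w @ y) / (w @ w)

# Subtracting n * sigma_uu removes the measurement error variance
# that inflates the denominator.
corrected = (w @ y) / (w @ w - n * sigma_uu)
```

In this setup the naive slope targets beta / (1 + sigma_uu), about 1.33 here, while the corrected slope targets beta itself. The thesis applies the same idea inside an estimating function rather than to a closed-form least-squares slope.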
Based on the proposed block empirical likelihood ratio for the parameter $\beta$, we can easily obtain the maximum empirical likelihood estimator (MELE) $\hat\beta$ of $\beta$, and then the corrected backfitting estimators of the nonparametric functions. Residual-adjusted empirical log-likelihood ratio statistics for the nonparametric functions are then constructed, and the corresponding nonparametric Wilks theorems are established. It is worth pointing out that our inference for $f_1(z_1)$ does not require an accurate estimate of the nonparametric function $f_2(z_2)$ at every point; we only need the values of the corrected backfitting estimator of $f_2(z_2)$ at the sample observations.

Following the ideas of Chapter 2, in Chapter 3 we study empirical likelihood inference for the longitudinal semiparametric varying-coefficient partially linear errors-in-variables model with missing responses. Suppose the observed data $\{(Y_{ij}, Z_{ij}, U_{ij}, W_{ij}, \delta_{ij}),\ i = 1,\ldots,n,\ j = 1,\ldots,n_i\}$ is an incomplete random sample from the following model:

$$Y_{ij} = X_{ij}^T\beta + Z_{ij}^T\alpha(U_{ij}) + \varepsilon_{ij}, \qquad W_{ij} = X_{ij} + V_{ij},$$

where $Y_{ij}$ is the response variable for the $j$th measurement of the $i$th subject, and $X_{ij}$, $Z_{ij}$ and $U_{ij}$ are the covariates for the $j$th measurement of the $i$th subject. $\beta = (\beta_1,\ldots,\beta_p)^T$ is a $p$-dimensional vector of unknown parameters, $\alpha(\cdot) = (\alpha_1(\cdot),\ldots,\alpha_q(\cdot))^T$ is a $q$-dimensional vector of unknown coefficient functions, and $\varepsilon_{ij}$ is the random error for the $j$th measurement of the $i$th subject, which satisfies $E(\varepsilon_{ij} \mid X_{ij}, Z_{ij}, U_{ij}) = 0$ and $\mathrm{Var}(\varepsilon_{ij} \mid X_{ij}, Z_{ij}, U_{ij}) = \sigma^2$. $V_{ij}$ is the measurement error, with $E(V_{ij}) = 0$ and $\mathrm{Cov}(V_{ij}) = \Sigma_{uu}$. In addition, $Z_{ij}$, $U_{ij}$ and the surrogate $W_{ij}$ are completely observed; $\delta_{ij} = 1$ means that $Y_{ij}$ is observed, and $\delta_{ij} = 0$ indicates that $Y_{ij}$ is missing.

First, we construct the correction-for-attenuation block empirical log-likelihood ratio statistic for the unknown parameter $\beta$ and prove that its limiting distribution is a standard chi-square distribution. Based on this result, we can obtain the confidence region for the unknown parameter $\beta$.
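The chi-square limit above turns into a confidence region in the usual way. As a sketch, write $\hat{l}(\beta)$ for the block empirical log-likelihood ratio statistic (with the customary $-2$ scaling), $\beta_0$ for the true parameter and $1-\gamma$ for the nominal level. This notation is assumed here for illustration and is not taken from the thesis.

```latex
\hat{l}(\beta_0) \xrightarrow{d} \chi^2_p,
\qquad
\mathcal{C}_{\gamma} = \bigl\{\beta : \hat{l}(\beta) \le c_{\gamma}\bigr\},
\qquad
P\bigl(\chi^2_p \le c_{\gamma}\bigr) = 1-\gamma .
```

Under the stated limit, $\mathcal{C}_{\gamma}$ has asymptotic coverage $1-\gamma$, and no variance estimate has to be plugged in to build it.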
Simulation studies comparing coverage probabilities and average lengths of the confidence intervals indicate that the proposed method outperforms the profile-based least-squares method.

As a robust estimation method, quantile regression has been widely used in quantitative economics, social science and biomedicine. Its advantage is that it does not require the second moment of the error term to exist; its disadvantage is that the efficiency of the quantile regression estimator is sometimes very low. Zou and Yuan (2008) proposed a new parameter estimation method for the linear model, called composite quantile regression. Composite quantile regression inherits the robustness of quantile regression and can improve its efficiency significantly. It assumes that the effect of the predictor variables is the same at different quantile levels, and that only the intercept term differs. Compared with the classical least squares method, composite quantile regression is not sensitive to outliers, and it can improve on the efficiency of the least squares estimator in most cases. In Chapter 4, we combine the empirical likelihood procedure with composite quantile regression to construct the confidence region for the unknown parameter in the linear regression model. Consider the following linear model:

$$Y_i = X_i^T\beta + \varepsilon_i, \qquad i = 1,\ldots,n,$$

where $\beta = (\beta_1,\ldots,\beta_p)^T \in \mathbb{R}^p$ is the unknown regression coefficient vector. The idea of the composite quantile method is to consider multiple quantile regression models and combine their information, with the regression coefficient being the same across the models. Let $0 < \tau_1 < \tau_2 < \cdots < \tau_q < 1$, and let $b_\tau$ denote the $100\tau\%$ quantile of $\varepsilon$.
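The composite quantile idea described above can be sketched as a loss function: one slope vector shared by all quantile levels, plus one intercept per level. The code below is an illustrative sketch only; the function names, the simulated heavy-tailed errors and the equally spaced levels are assumptions, and no minimization routine is included.

```python
import numpy as np

def check_loss(z, tau):
    """Quantile check function rho_tau(z) = tau * z - z * 1(z <= 0)."""
    z = np.asarray(z, dtype=float)
    return z * (tau - (z <= 0))

def cqr_objective(beta, intercepts, x, y, taus):
    """Composite quantile loss: shared slope beta, one intercept per level."""
    resid = y - x @ beta
    return sum(check_loss(resid - b, t).sum() for b, t in zip(intercepts, taus))

# Illustrative data in which the slope is common to all quantile levels.
rng = np.random.default_rng(1)
x = rng.normal(size=(2000, 2))
beta_true = np.array([1.0, -0.5])
eps = rng.standard_t(df=3, size=2000)    # heavy-tailed errors (assumed)
y = x @ beta_true + eps

taus = np.arange(1, 6) / 6.0             # equally spaced levels, q = 5
b_tau = np.quantile(eps, taus)           # intercepts: quantiles of the error
loss_true = cqr_objective(beta_true, b_tau, x, y, taus)
loss_off = cqr_objective(beta_true + 1.0, b_tau, x, y, taus)
```

Minimizing `cqr_objective` jointly over the slope and the intercepts gives the composite quantile regression estimator. Here the loss is only evaluated at the true slope and at a shifted slope to show that it separates the two.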
We first construct two kinds of estimating equations for the unknown parameter $\beta$: one is the estimating equation $Z_{i1}(\beta)$ based on the composite quantile regression method, and the other is the estimating equation $Z_{i2}(\beta)$ based on the quantile-wise regression outcomes. We then construct the corresponding empirical likelihood ratio statistics and maximum empirical likelihood estimators, and prove that the asymptotic distribution of the empirical likelihood ratio statistics is the standard chi-square distribution.

To achieve higher-order accuracy, we propose a smoothed empirical likelihood approach that approximates the indicator function in the quantile score function by a smooth function. We obtain the corresponding smoothed estimating equations $Z_{i1h}(\beta)$ and $Z_{i2h}(\beta)$, prove that the smoothed empirical likelihood ratio statistics are asymptotically standard chi-square distributed, and show that, with Bartlett correction, the accuracy of the smoothed empirical likelihood confidence region can be improved, with a smaller coverage error.

In Chapter 5, we study the following general nonparametric regression model:

$$Y = m(T) + \sigma(T)\varepsilon,$$

where $Y$ is the response variable, $T$ is a scalar covariate independent of the random error $\varepsilon$, and $m(T) = E(Y \mid T)$ is the smooth nonparametric regression function. The standard deviation function $\sigma(T)$ is strictly positive, and $E(\varepsilon) = 0$, $\mathrm{Var}(\varepsilon) = 1$. Many smoothing methods, such as kernel regression, spline smoothing, orthogonal series approximation and local polynomial regression, have been proposed for nonparametric regression. Among these linear smoothers, local polynomial regression, which has been thoroughly studied in the literature (Fan and Gijbels, 1996), is the best linear smoother in terms of minimax efficiency. Suppose $\{(t_i, y_i),\ i = 1,\ldots,n\}$ is a random sample from the above model. In this thesis, we are interested in efficient estimation of the derivative $m'(\cdot)$ of the nonparametric function rather than the function $m(\cdot)$ itself.
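For orientation, derivative estimation by local polynomial regression can be sketched with the least-squares version, which is the benchmark the quantile-based estimators are compared against. The Epanechnikov kernel, the bandwidth `h` and the function name below are assumptions for illustration, not choices documented in the thesis.

```python
import numpy as np

def local_quadratic_derivative(t, y, t0, h):
    """Local quadratic least-squares estimate of m'(t0)."""
    d = np.asarray(t, dtype=float) - t0
    u = d / h
    # Epanechnikov kernel weights; observations farther than h from t0 get weight 0.
    k = np.where(np.abs(u) < 1, 0.75 * (1 - u**2), 0.0)
    # Local expansion m(t) ~ a + b1 * (t - t0) + b2 * (t - t0)^2.
    design = np.column_stack([np.ones_like(d), d, d**2])
    sw = np.sqrt(k)
    coef, *_ = np.linalg.lstsq(design * sw[:, None],
                               np.asarray(y, dtype=float) * sw, rcond=None)
    return coef[1]  # the coefficient of (t - t0) estimates m'(t0)

t = np.linspace(0.0, 1.0, 201)
y = t**2                                   # noise-free quadratic, so m'(t) = 2t
est = local_quadratic_derivative(t, y, t0=0.4, h=0.2)
```

Because the test function is exactly quadratic and noise-free, the local quadratic fit recovers the derivative at `t0` up to floating point error. With noisy data the bandwidth controls the usual bias and variance trade-off.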
In Chapter 5, we derive efficient estimators of $m'(\cdot)$ by combining information across quantiles. One way is to minimize a weighted local quadratic composite quantile regression criterion that combines the check functions at the different quantile positions, where $\rho_{\tau_k}(z) = \tau_k z - z I(z \le 0)$, $k = 1,\ldots,q$, is the quantile loss function at the $\tau_k$-quantile position, and $\omega_k \ge 0$, $k = 1,\ldots,q$, with $\sum_{k=1}^{q}\omega_k = 1$, are weights that can be determined from the data. The weighted local quadratic composite quantile regression (WCQR) estimator of $m'(t_0)$ is then $\hat m'_{WCQR}(t_0) = \hat b_1$, the fitted coefficient of the linear term in the local quadratic expansion about $t_0$, and we obtain the asymptotic bias, asymptotic variance and asymptotic normality of $\hat m'_{WCQR}(t_0)$.

Another way is to take a weighted composite of estimators. For a fixed $\tau_k$, $0 < \tau_k < 1$, we consider the local quadratic nonparametric quantile regression at level $\tau_k$. The solution for $b_1$ in that optimization problem is an estimator of $m'(t_0)$, denoted $\hat m'(\tau_k, t_0)$. For the levels $\tau_k = k/(q+1)$, $k = 1,2,\ldots,q$, the weighted average of the $\hat m'(\tau_k, t_0)$ provides a new estimator of $m'(t_0)$, called the weighted quantile averaging estimator (WQAE), that is, $\hat m'_{WQAE}(t_0) = \sum_{k=1}^{q}\omega_k \hat m'(\tau_k, t_0)$, where $\omega = (\omega_1,\omega_2,\ldots,\omega_q)^T$ is the weight vector satisfying $\sum_{k=1}^{q}\omega_k = 1$ and $\sum_{k=1}^{q}\omega_k C_k = 0$. We obtain the asymptotic bias, asymptotic variance and asymptotic normality of $\hat m'_{WQAE}(t_0)$.

Simulation studies illustrate that our proposed methods perform better than the local quadratic least squares estimator in terms of asymptotic relative efficiency.
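As a hedged sketch of the two kinds of criteria described above, one plausible form uses a kernel $K$, a bandwidth $h$ and a local quadratic expansion about $t_0$. The symbols $K$, $h$, $a_k$, $a$ and $b_2$ are assumptions introduced here; the thesis's own displays may differ in parameterization.

```latex
% WCQR: common slope b_1 and curvature b_2, one intercept a_k per quantile level
\min_{a_1,\ldots,a_q,\,b_1,\,b_2}\ \sum_{k=1}^{q}\omega_k\sum_{i=1}^{n}
  \rho_{\tau_k}\!\bigl\{y_i-a_k-b_1(t_i-t_0)-b_2(t_i-t_0)^2\bigr\}\,
  K\!\Bigl(\frac{t_i-t_0}{h}\Bigr)

% Single-level local quadratic quantile regression at level \tau_k
\min_{a,\,b_1,\,b_2}\ \sum_{i=1}^{n}
  \rho_{\tau_k}\!\bigl\{y_i-a-b_1(t_i-t_0)-b_2(t_i-t_0)^2\bigr\}\,
  K\!\Bigl(\frac{t_i-t_0}{h}\Bigr)
```

In both cases the minimizing $b_1$ plays the role of the derivative estimate. The first criterion pools the quantile levels before fitting, while the second fits each level separately and averages the resulting slopes afterwards.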
Keywords/Search Tags: empirical likelihood, semiparametric regression, nonparametric regression, additive partially linear model, varying coefficient partially linear model, composite quantile regression, local composite quantile regression, errors-in-variables data