
Empirical Likelihood Inferences For Three Classes Of Statistical Models With Missing Data

Posted on: 2009-01-21
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
Full Text: PDF
GTID: 2120360245959508
Subject: Probability theory and mathematical statistics
Abstract/Summary:
Item non-response occurs frequently in practice: in opinion polls, market research surveys, medical studies and other scientific experiments. In such circumstances, the usual inferential procedures for complete data sets cannot be applied directly, and the data must be treated before standard statistical approaches can be used. A common method is to impute a value for each missing response to obtain a 'complete sample' and then apply standard statistical methods. Statistical inference for missing data is an important research field (e.g. Little and Rubin, Statistical Analysis with Missing Data[M], New York: John Wiley and Sons, 2002).

Wang and Rao (Empirical likelihood for linear regression models under imputation for missing responses[J], Canadian J Statist, 2001, 29: 597-608) obtain empirical likelihood (EL) confidence intervals/regions for the regression coefficient in a linear model with fixed design points and missing data. They use regression imputation to fill in the missing data, construct an EL statistic based on the 'complete sample' after imputation, and show that the EL statistic has the limiting distribution of a weighted sum of chi-squared variables with unknown weights. They therefore need an adjusted EL to obtain a confidence region for the regression coefficient, and the adjustment coefficient has to be estimated, which reduces the accuracy of the confidence region. In chapter 2 of this paper, we use a new method to produce a 'complete sample'. Based on this data set, we construct an EL statistic whose limiting distribution is chi-squared. Based on this result, we can construct an EL confidence region for the regression coefficient without adjustment, which can improve the accuracy of the confidence region.

Comparing differences between populations is an important research topic in medicine, economics and education.
Qin Yongsong and Zhao Lincheng (Semi-parametric likelihood confidence intervals for various differences of two populations[J], Statistics and Probability Letters, 1997, 33(2): 135-143; Empirical likelihood confidence intervals for quantile differences of two populations[J], Chinese Ann Math, 1997, 18A(6): 687-694; Semi-empirical likelihood confidence intervals for quantile differences of two samples[J], Acta Mathematicae Applicatae Sinica, 1998, 21(1): 103-112; Empirical likelihood ratio confidence intervals for various differences of two populations[J], System Science and Mathematical Sciences, 2000, 13: 23-30) systematically study the construction of EL confidence intervals for various differences of two populations under complete data. Qin and Zhang (Empirical likelihood confidence intervals for differences between two datasets with missing data[J], Pattern Recognition Letters, 2008, 29(6): 803-812) construct EL confidence intervals for differences of two nonparametric populations under the missing completely at random (MCAR) mechanism, using (single) random imputation to fill in the missing data. In chapter 3 of this paper, we use fractional imputation to impute the missing data and obtain EL confidence intervals for differences of two nonparametric populations under the MCAR missing mechanism, which can improve the accuracy of the confidence intervals. In chapter 4 of this paper, we generalize the results of chapter 3 to the missing at random (MAR) mechanism and obtain EL confidence intervals for differences of two nonparametric populations under MAR.

The new findings of this paper are summarized as follows.

1. In studying the construction of confidence intervals for the regression coefficient in a linear model with fixed design points and missing data, we propose a new method to produce a 'complete sample'. Based on this data set, we construct an EL statistic whose limiting distribution is chi-squared.
Based on this result, we can construct an EL confidence region for the regression coefficient without adjustment, which can improve the accuracy of the confidence region.

2. Under incomplete data and the MCAR missing mechanism, we use fractional imputation (a repeated imputation method) to impute the missing data and obtain EL confidence intervals for differences of two nonparametric populations. The usual (single) imputation method is a special case of fractional imputation. As the number of repeated imputations increases, the imputation variance decreases. Compared with single imputation, fractional imputation can therefore improve the accuracy of the confidence intervals.

3. Under incomplete data and the MAR missing mechanism, we use fractional imputation to impute the missing data and obtain EL confidence intervals for differences of two nonparametric populations. MAR is a weaker restriction than MCAR and is more easily satisfied in real applications.
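
The claim that repeated imputation reduces the imputation variance can be illustrated with a small simulation. The following is only a minimal sketch under assumptions that are not taken from the thesis: a single sample, responses missing completely at random, donors drawn with replacement from the observed values, and each missing value replaced by the average of kappa draws. The names fractional_impute, imputation_variance and kappa are hypothetical, and the sketch does not reproduce the EL construction of the thesis.

```python
import numpy as np


def fractional_impute(y, observed, kappa, rng):
    """Replace each missing entry by the mean of kappa donors drawn with
    replacement from the observed values (kappa = 1 is single imputation)."""
    y = np.asarray(y, dtype=float)
    observed = np.asarray(observed, dtype=bool)
    donors = y[observed]
    n_missing = int((~observed).sum())
    # One row of kappa random donors per missing response.
    draws = rng.choice(donors, size=(n_missing, kappa), replace=True)
    completed = y.copy()
    completed[~observed] = draws.mean(axis=1)
    return completed


def imputation_variance(y, observed, kappa, n_rep=1000, seed=0):
    """Monte Carlo variance of the completed-sample mean over repeated
    imputations, with the observed data held fixed."""
    rng = np.random.default_rng(seed)
    means = [fractional_impute(y, observed, kappa, rng).mean()
             for _ in range(n_rep)]
    return float(np.var(means))


# Illustrative data (assumed, not from the thesis): normal responses,
# each observed with probability 0.7 independently of its value (MCAR).
data_rng = np.random.default_rng(42)
n = 200
y_full = data_rng.normal(loc=1.0, scale=2.0, size=n)
observed = data_rng.random(n) < 0.7
y = np.where(observed, y_full, np.nan)

for kappa in (1, 5, 20):
    print(kappa, imputation_variance(y, observed, kappa))
```

In this setting the variance contributed by the random draws is roughly proportional to 1/kappa, which is consistent with the statement that single imputation is the special case kappa = 1 and that larger kappa gives a more stable completed sample.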
Keywords/Search Tags: missing data, MCAR missing mechanism, MAR missing mechanism, fractional imputation, empirical likelihood