Statistical methods for missing data in complex sample surveys

Posted on: 2010-04-20
Degree: Ph.D
Type: Dissertation
University: University of Michigan
Candidate: Andridge, Rebecca Roberts
Full Text: PDF
GTID: 1440390002987399
Subject: Statistics
Abstract/Summary:
Missing data are a pervasive problem in large-scale surveys, arising when a sampled unit does not respond to a particular question or to the entire survey. This dissertation addresses two major topics in survey nonresponse: hot deck imputation and evaluation of nonresponse bias.

Chapter II contains an extensive review of hot deck imputation, which despite being used extensively in practice has theory that is not as well developed as that of other imputation methods. One of the understudied areas discovered in this review is the topic for the subsequent chapter: Chapter III addresses the use of sample weights in the hot deck. The naive approach is to ignore sample weights in creation of adjustment cells, which effectively imputes the unweighted sample distribution of respondents in an adjustment cell, potentially causing bias. Alternative approaches have been proposed that incorporate weights into the probabilities of selection for each donor (Cox, 1980; Rao and Shao, 1992). In this chapter we show that these weighted hot decks do not correct for bias when the outcome is related to the sampling weight and the response propensity, and suggest an alternative, simple to implement method that can correct the bias in these situations.

The first chapters concern methods for imputing missing data in the case where (at worst) missingness is at random (MAR) (Rubin, 1976); the final two chapters focus instead on a method for estimating and correcting nonresponse bias when missingness may not be at random (NMAR). Chapter IV introduces proxy pattern-mixture analysis (PPMA), a new method for assessing and adjusting for nonresponse bias for a continuous survey outcome when there is some fully observed auxiliary information available. We describe the PPMA model along with several different estimation strategies and propose a sensitivity analysis to capture a range of potential missingness mechanisms. In Chapter V we propose an extension of the PPMA to binary and ordinal outcomes using probit models. In addition, the important issue of model misspecification is discussed. Throughout the dissertation, methods are illustrated through simulation and application to data from NHANES III.
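
The two donor-selection schemes contrasted in the abstract can be illustrated with a minimal Python sketch. It is not code from the dissertation: the function name `hot_deck_impute`, the record layout, and the field names `cell`, `y`, and `w` are all hypothetical, and the weighted branch is a plain random draw with probability proportional to the sample weight, not the specific procedures of Cox (1980) or Rao and Shao (1992). It also does not implement the alternative method proposed in Chapter III, which the abstract does not describe.

```python
import random


def hot_deck_impute(records, cell_key, value_key, weight_key=None, seed=None):
    """Fill missing values (None) with a donor value from the same adjustment cell.

    weight_key=None  -> donors drawn with equal probability (unweighted hot deck).
    weight_key given -> donors drawn with probability proportional to that weight.
    All names here are illustrative assumptions, not the dissertation's notation.
    """
    rng = random.Random(seed)

    # Group respondents (potential donors) by adjustment cell.
    donors = {}
    for rec in records:
        if rec[value_key] is not None:
            donors.setdefault(rec[cell_key], []).append(rec)

    imputed = []
    for rec in records:
        new_rec = dict(rec)  # copy so the input records are not modified
        if rec[value_key] is None:
            pool = donors.get(rec[cell_key])
            if not pool:
                raise ValueError(f"no donors in cell {rec[cell_key]!r}")
            if weight_key is None:
                donor = rng.choice(pool)
            else:
                weights = [d[weight_key] for d in pool]
                donor = rng.choices(pool, weights=weights, k=1)[0]
            new_rec[value_key] = donor[value_key]
        imputed.append(new_rec)
    return imputed


# Toy sample: two adjustment cells, one nonrespondent in each.
sample = [
    {"cell": "A", "y": 10.0, "w": 1.0},
    {"cell": "A", "y": 20.0, "w": 3.0},
    {"cell": "A", "y": None, "w": 2.0},
    {"cell": "B", "y": 5.0, "w": 1.0},
    {"cell": "B", "y": None, "w": 4.0},
]

unweighted = hot_deck_impute(sample, "cell", "y", seed=0)
weighted = hot_deck_impute(sample, "cell", "y", weight_key="w", seed=0)
```

In both calls the imputed value always comes from a respondent in the same cell; the only difference is how likely each donor is to be picked. The abstract's point is that neither scheme is guaranteed to remove bias when the outcome is related to both the sampling weight and the response propensity.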
Keywords/Search Tags: Data, Sample, Methods, Survey