A Probability Based Framework for Testing the Missing Data Mechanism

Posted on: 2014-06-07
Degree: Ph.D.
Type: Dissertation
University: University of California, Los Angeles
Candidate: Lin, Johnny Cheng-Han
Full Text: PDF
GTID: 1450390005985641
Subject: Statistics
Abstract/Summary:
Many methods exist for imputing missing data, but fewer methods have been proposed to test the missing data mechanism. Little (1988) introduced a multivariate chi-square test for the missing completely at random (MCAR) data mechanism that compares the observed means for each missing data pattern with expectation-maximization (EM) estimated means. As an alternative, this manuscript proposes two new ways of testing MCAR that use estimated parameters from missingness indicators rather than moment information from observed scores. The first statistic in the probability-based (PBB) family, PBB-MCAR I, is a chi-square test of independence that tests the assumption that missingness indicators are independent among all grouping patterns. The second statistic, PBB-MCAR II, is a chi-square goodness-of-fit statistic that tests differences between observed and expected probabilities conditional on ranked values of a suspect variable that drives missingness dependencies. A simulation study showed that although Little's test consistently maintained optimal Type I error rates, the empirical power of PBB-MCAR II to detect violations of MCAR was on par with Little's test under most conditions, whereas PBB-MCAR I had lower power to detect departures from MCAR because it tests a more restricted set of independence assumptions. These newly developed test statistics were demonstrated in two education-based applications: (a) as a way of testing the missing data mechanism when creating longitudinal trajectories of intramural sports participation among African American students, and (b) as a tool to detect departures from completely at random test-taking. Future work will involve creating an R package to promote the use of these missing data tests among education researchers, extending PBB-MCAR II to incorporate auxiliary variables, and resolving the problem of sparse missing data patterns by adopting the limited-information goodness-of-fit test proposed by Maydeu-Olivares and Joe (2005).
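To make the idea of testing independence among missingness indicators concrete, the following is a minimal Python sketch, not the dissertation's PBB-MCAR I statistic. It assumes only two variables, codes each value as missing (1) or observed (0), and applies an ordinary Pearson chi-square test of independence to the resulting 2x2 table. The function names and the toy data are hypothetical and chosen purely for illustration.

```python
import math


def missingness_indicators(rows):
    """Return a 0/1 matrix with 1 where a value is missing (None), else 0."""
    return [[1 if value is None else 0 for value in row] for row in rows]


def chi2_independence_2x2(r_a, r_b):
    """Pearson chi-square test of independence for two binary indicators.

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = len(r_a)
    counts = [[0, 0], [0, 0]]
    for a, b in zip(r_a, r_b):
        counts[a][b] += 1

    row_totals = [sum(counts[0]), sum(counts[1])]
    col_totals = [counts[0][0] + counts[1][0], counts[0][1] + counts[1][1]]

    statistic = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                raise ValueError("an indicator is constant; the test is undefined")
            statistic += (counts[i][j] - expected) ** 2 / expected

    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical toy data: two variables, None marks a missing score.
# Here both variables are always missing together, so the indicators
# are perfectly dependent.
dependent_rows = [(1.0, 2.0)] * 50 + [(None, None)] * 50
ind = missingness_indicators(dependent_rows)
dep_stat, dep_p = chi2_independence_2x2([r[0] for r in ind], [r[1] for r in ind])

# Hypothetical toy data in which every missingness pattern is equally
# frequent, so the indicators are exactly independent in the sample.
independent_rows = (
    [(1.0, 2.0)] * 25 + [(1.0, None)] * 25 + [(None, 2.0)] * 25 + [(None, None)] * 25
)
ind = missingness_indicators(independent_rows)
ind_stat, ind_p = chi2_independence_2x2([r[0] for r in ind], [r[1] for r in ind])

print(dep_stat, dep_p)
print(ind_stat, ind_p)
```

A large statistic with a small p-value suggests that missingness on one variable is associated with missingness on the other. The statistics described in the abstract operate on all grouping patterns jointly, which this two-variable sketch does not attempt to reproduce.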
Keywords/Search Tags:Missing data, Test, PBB-MCAR II, Proposed