An application of the EM algorithm in analyzing the CUNY open-admissions study missing data

Posted on: 1993-08-23
Degree: Ph.D.
Type: Dissertation
University: City University of New York
Candidate: Na, Hazon
GTID: 1470390014997693
Subject: Educational Psychology
Abstract/Summary:
The present study is based on an analysis of a sample from the CUNY open-admissions data set. The data set consisted of two portions: an original sample and a follow-up sample that contained only 14% of the original cases. Not only were data missing for those cases not in the follow-up sample, but the original sample variables were also not completely observed. The data set is multivariate, with both continuous and categorical variables incompletely observed. In analyzing such a data set, many researchers use ad hoc approaches that lack a theoretical basis. For example, many statistical packages offer deletion or substitution methods as a routine treatment for missing values before an analysis is performed.

It is important to note that deletion methods using only respondents with no missing values may yield biased results, unless the complete cases can be viewed as a completely random subsample of the original sample observations. A more realistic approach is to assume that the data are not missing completely at random, but rather are missing at random as a function of known subject characteristics. Given this more realistic assumption about the missing-data process, one can apply Maximum Likelihood methods to estimate the parameters of interest. The Maximum Likelihood method was used in the present study.

In this study, the Maximum Likelihood estimates for means, variances, and correlations were obtained by implementing the Expectation-Maximization (EM) algorithm suggested by Little & Schluchter (1985). These Maximum Likelihood estimates were compared with the estimates obtained from three different ad hoc methods: pairwise deletion, listwise deletion, and weighting analyses.

Although the results show some differences in the correlation estimates, there was little evidence that the methods yield different estimates of proportions, means, and standard deviations. Possible explanations for this result are discussed. In general, however, the ad hoc and Maximum Likelihood methods will not agree.
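
The contrast between listwise deletion and Maximum Likelihood under a missing-at-random mechanism can be illustrated with a minimal Python sketch. It is not the dissertation's implementation and does not use the CUNY data. It assumes a purely continuous, multivariate normal model, whereas the Little & Schluchter (1985) procedure also handles categorical variables. The function name `em_mvnormal`, the synthetic bivariate data, the correlation of 0.8, and the logistic missingness rule are all hypothetical choices made for illustration.

```python
import numpy as np


def em_mvnormal(X, n_iter=500, tol=1e-8):
    """EM estimates of the mean vector and covariance matrix of a
    multivariate normal sample in which missing entries are coded as NaN.
    Illustrative sketch: continuous variables only."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)
    patterns = np.unique(missing, axis=0)  # distinct missingness patterns

    # Starting values: available-case means and variances, zero covariances.
    mu = np.nanmean(X, axis=0)
    sigma = np.diag(np.nanvar(X, axis=0))

    for _ in range(n_iter):
        sum_x = np.zeros(p)
        sum_xx = np.zeros((p, p))
        for pat in patterns:
            rows = (missing == pat).all(axis=1)
            m, o = pat, ~pat
            block = X[rows].copy()
            cond_cov = np.zeros((p, p))
            if m.any():
                if o.any():
                    # E-step: regress the missing on the observed variables.
                    coef = sigma[np.ix_(m, o)] @ np.linalg.inv(sigma[np.ix_(o, o)])
                    block[:, m] = mu[m] + (block[:, o] - mu[o]) @ coef.T
                    cond_cov[np.ix_(m, m)] = (
                        sigma[np.ix_(m, m)] - coef @ sigma[np.ix_(o, m)]
                    )
                else:
                    # Row with nothing observed: use the current estimates.
                    block[:, m] = mu[m]
                    cond_cov = sigma.copy()
            sum_x += block.sum(axis=0)
            # Expected cross-products include the conditional covariance.
            sum_xx += block.T @ block + rows.sum() * cond_cov

        # M-step: update the mean and the ML (divide-by-n) covariance.
        new_mu = sum_x / n
        new_sigma = sum_xx / n - np.outer(new_mu, new_mu)
        converged = (
            np.max(np.abs(new_mu - mu)) < tol
            and np.max(np.abs(new_sigma - sigma)) < tol
        )
        mu, sigma = new_mu, new_sigma
        if converged:
            break
    return mu, sigma


# Synthetic illustration (hypothetical data): y is more likely to be
# missing when the fully observed x is large, so y is missing at random
# given x but not missing completely at random.
rng = np.random.default_rng(0)
n, rho = 5000, 0.8
true_cov = np.array([[1.0, rho], [rho, 1.0]])
data = rng.multivariate_normal([0.0, 0.0], true_cov, size=n)

p_miss = 1.0 / (1.0 + np.exp(-2.0 * data[:, 0]))
data_obs = data.copy()
data_obs[rng.random(n) < p_miss, 1] = np.nan

complete = ~np.isnan(data_obs).any(axis=1)
listwise_mean = data_obs[complete].mean(axis=0)
em_mean, em_cov = em_mvnormal(data_obs)

print("listwise mean:", listwise_mean)
print("EM mean:      ", em_mean)
```

In this synthetic setting the complete cases over-represent low values of x, so the listwise mean of y falls well below the true value of zero, while the EM estimate stays close to it. The sketch groups rows by missingness pattern so that each E-step needs only one regression per pattern rather than one per case.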
Keywords/Search Tags: Data, Missing, Maximum likelihood, Ad hoc, Sample, Methods