
The impact of missing data treatments in a multiple regression analysis: A Monte Carlo comparison of deterministic imputation, stochastic imputation, multiple imputation, and the deletion procedure

Posted on: 1997-12-31
Degree: Ph.D.
Type: Dissertation
University: University of South Florida
Candidate: Newsome, Dwight Howard
Full Text: PDF
GTID: 1460390014984558
Subject: Educational tests & measurements
Abstract/Summary:
This study investigated, within the context of a three-predictor multiple regression analysis with randomly missing data, the effects of ten missing data treatments on the sample estimate of R² and each regression coefficient. Missing data treatments compared were: (a) listwise deletion, (b) pairwise deletion, (c) mean substitution, (d) simple regression imputation, (e) multiple regression imputation, (f) mean substitution with an added random residual value, (g) simple regression with an added random residual value, (h) multiple regression with an added random residual value, (i) multiple imputation using stochastic simple regression imputes (with n = 10, 50, or 100 imputes), and (j) multiple imputation using stochastic multiple regression imputes (n = 10, 50, or 100).

A Monte Carlo method was used in which one thousand samples of 50, 100, and 200 were drawn with replacement from a pseudo-population of 18,170. Six proportions of data were randomly deleted within each sample to represent cases with missing data. Each treatment was applied to the data sets with missing values, and the resulting regression parameters were compared with those from the complete data condition. The dependent variables analyzed were the deviation of the obtained value of R² from that in the complete data condition and the deviation of each regression coefficient from its value in the complete data condition. Data were analyzed by computing effect sizes for the missing data treatments relative to the complete data condition.

Deletion procedures were more effective than imputation procedures, with pairwise deletion being the most effective procedure of all. Each deterministic imputation procedure outperformed its stochastic counterpart, with deterministic multiple regression imputation being the most effective. Multiple imputation procedures using stochastic regression imputes slightly outperformed the stochastic regression procedure that provided the imputes. No evidence was found that increasing the number of imputations beyond n = 10 increased the effectiveness of multiple imputation procedures.
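The Monte Carlo design described above can be illustrated with a minimal Python sketch. It is not the dissertation's code, and several details are assumptions made only for illustration: the pseudo-population is synthetic (correlated normal predictors with arbitrary coefficients), values are deleted from the first predictor only, regression imputation predicts that predictor from the other two, and the summary is the mean absolute deviation of R² from the complete data value rather than the effect sizes used in the study. Only four of the ten treatments are shown; pairwise deletion and multiple imputation are omitted. All function names are hypothetical.

```python
import numpy as np

def r_squared(X, y):
    """R^2 of an ordinary least squares fit of y on X with an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    return 1.0 - (resid @ resid) / np.sum((y - y.mean()) ** 2)

def listwise(X, y):
    """Drop every case whose first predictor is missing."""
    keep = ~np.isnan(X[:, 0])
    return X[keep], y[keep]

def mean_sub(X, y):
    """Replace missing values with the mean of the observed values."""
    out = X.copy()
    miss = np.isnan(out[:, 0])
    out[miss, 0] = np.nanmean(X[:, 0])
    return out, y

def regression_impute(X, y, rng=None):
    """Impute predictor 0 from predictors 1 and 2 (assumption: y is not used).

    With rng=None the imputation is deterministic; with a generator, a
    random normal residual is added to each predicted value (stochastic).
    """
    out = X.copy()
    miss = np.isnan(out[:, 0])
    A_obs = np.column_stack([np.ones((~miss).sum()), X[~miss, 1:]])
    beta, *_ = np.linalg.lstsq(A_obs, X[~miss, 0], rcond=None)
    pred = np.column_stack([np.ones(miss.sum()), X[miss, 1:]]) @ beta
    if rng is not None:
        resid = X[~miss, 0] - A_obs @ beta
        sd = np.sqrt(resid @ resid / (len(resid) - A_obs.shape[1]))
        pred = pred + rng.normal(0.0, sd, size=miss.sum())
    out[miss, 0] = pred
    return out, y

def simulate(n=100, prop_missing=0.2, reps=200, seed=0):
    """Mean absolute deviation of R^2 from the complete data R^2, per treatment."""
    rng = np.random.default_rng(seed)
    # Synthetic pseudo-population of 18,170 cases (illustrative values only).
    cov = np.array([[1.0, 0.4, 0.3], [0.4, 1.0, 0.5], [0.3, 0.5, 1.0]])
    pop_X = rng.multivariate_normal(np.zeros(3), cov, size=18170)
    pop_y = pop_X @ np.array([0.5, 0.3, 0.2]) + rng.normal(0.0, 1.0, size=18170)
    treatments = {
        "listwise": listwise,
        "mean": mean_sub,
        "regression": regression_impute,
        "stochastic regression": lambda X, y: regression_impute(X, y, rng),
    }
    devs = {name: [] for name in treatments}
    for _ in range(reps):
        idx = rng.integers(0, len(pop_y), size=n)  # sample with replacement
        X, y = pop_X[idx], pop_y[idx]
        full_r2 = r_squared(X, y)
        X_miss = X.copy()
        drop = rng.choice(n, size=int(prop_missing * n), replace=False)
        X_miss[drop, 0] = np.nan  # randomly delete values in predictor 0
        for name, treat in treatments.items():
            Xt, yt = treat(X_miss, y)
            devs[name].append(abs(r_squared(Xt, yt) - full_r2))
    return {name: float(np.mean(d)) for name, d in devs.items()}
```

Calling `simulate(n=50, prop_missing=0.1)` returns one summary value per treatment; looping over the sample sizes and missing data proportions would mirror the structure of the design. Results from this synthetic setup should not be expected to reproduce the dissertation's findings.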
Keywords/Search Tags: Multiple, Regression, Data, Imputation, Stochastic, Procedure, Deletion, Added random residual value