| Objective: A scale is a standardized measurement instrument consisting of several questions or self-assessment indicators, usually used to measure a person’s state, behavior, or attitude. Scales are commonly included in questionnaires and quantify the qualitative data collected in a survey. However, survey data may be missing because respondents misunderstand questions, avoid sensitive topics, or skip items. For survey scales with missing data, researchers usually either delete records with missing values or fill them with the mean (or mode) of the variable. Although these methods are simple, they reduce the sample size or bias the estimated parameter variances. In recent years, a growing number of studies have examined imputation methods for various types of missing data. Scale items are either two-category data or ranked data. Many researchers abroad have recently addressed missing data of this kind with hot-deck imputation and multiple imputation. However, few scholars at home or abroad have specifically discussed and compared which methods are suitable for missing scale values. In this study, three populations were selected (middle school students, college students, and elderly nursing home residents), covering three two-category scales and three multi-category rating scales with ranked data. After simulating missingness, we applied direct deletion, mode imputation, hot-deck imputation, and multiple imputation with logistic regression to handle the missing data, with the aim of identifying appropriate methods and strategies for scale data so that researchers can apply them reasonably in practice. Method: We used the Monte Carlo technique to simulate missing data under a missing-at-random mechanism with an arbitrary missing pattern. We ran fifty simulations at each missing rate of 5%, 10%, 15%, 20%, and 25%. We then handled the missing data with direct deletion, mode imputation, 
hot-deck imputation, and multiple imputation, and then calculated the corresponding evaluation indices at three levels: imputation accuracy, descriptive statistics, and statistical inference. Finally, we combined these indices to identify the most suitable method for handling missing values in scale data. All simulations were carried out with macro programs written in SAS 9.4. Results: For imputation accuracy among the two-category scales, mode imputation gave the best result for the Test Anxiety Scale, whereas for the Extroversion and Introversion Sub-scale and the Life Orientation Scale, hot-deck imputation was best when the missing proportion was low and mode imputation was best when it was high. Among the multi-category rating scales, mode imputation performed best for the Self-Acceptance Questionnaire, and hot-deck imputation performed best for the Activities of Daily Living Scale and the Adolescents’ Psychological Resilience Scale. For descriptive statistics, multiple imputation accurately estimated the mean scale score, and hot-deck imputation gave the best estimate of the standard deviation of the scale score, except for the Life Orientation Scale at high missing proportions, where direct deletion estimated the standard deviation best. At the level of statistical inference, multiple imputation with logistic regression gave the best estimates of the correlation coefficients and regression coefficients for the two-category scales. Among the multi-category rating scales, for the Self-Acceptance Questionnaire, multiple imputation was the best method for estimating the correlation coefficients when the missing proportion was low, and mode imputation was best when it was high; multiple imputation gave the best estimates of the regression coefficients at all missing proportions. For the Activities of Daily Living Scale, hot-deck imputation was the best method 
to estimate the correlation coefficients at all missing proportions; for the regression coefficients, hot-deck imputation was best when the missing proportion was low, and multiple imputation was best when it was high. Multiple imputation was the best way to estimate the correlation coefficients and regression coefficients for the Adolescents’ Psychological Resilience Scale. Finally, we calculated a total index by weighting imputation accuracy, statistical description, and statistical inference by 1, 2, and 2, respectively. Overall, multiple imputation was the best method for the two-category scales, followed by hot-deck imputation, while mode imputation and direct deletion performed worst. Among the multi-category rating scales, for the Activities of Daily Living Scale and the Adolescents’ Psychological Resilience Scale, hot-deck imputation outperformed multiple imputation when the missing proportion was low, and multiple imputation outperformed hot-deck imputation when it was high. Multiple imputation was the best method for the Self-Acceptance Questionnaire. Conclusion: In practical analysis, multiple imputation is overall the best method for handling missing scale data, followed closely by hot-deck imputation. Multiple imputation accurately estimates the mean, correlation coefficients, and regression coefficients, while hot-deck imputation estimates the standard deviation well. Moreover, hot-deck imputation is superior to multiple imputation when the missing proportion is low (5% or 10%) and the scale is a multi-category rating scale. Because most scales are multi-category rating scales and the missing proportion is usually low (typically below 10%), hot-deck imputation is recommended for handling missing data in multi-category rating scales. |
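
The simulation design in the Method section can be illustrated with a minimal sketch. The study itself used SAS 9.4 macros, which are not reproduced here; the Python below is a stand-in, and every function name in it is hypothetical. It assumes a synthetic two-category scale, a simple missing-at-random mechanism in which the chance of skipping items depends on an always-observed first item, and a plain random hot-deck without adjustment classes. Direct deletion and multiple imputation are omitted, and only the imputation-accuracy index is computed.

```python
import random
from collections import Counter


def simulate_missing(data, rate, rng):
    """Blank out cells in items 1.. with a probability that depends on the
    always-observed item 0 (an illustrative MAR mechanism, not the study's).
    The overall missing rate is only roughly `rate`."""
    out = [row[:] for row in data]
    for row in out:
        # Assumption: respondents scoring 1 on item 0 skip later items more often.
        p = min(1.0, rate * (1.5 if row[0] == 1 else 0.5))
        for j in range(1, len(row)):
            if rng.random() < p:
                row[j] = None
    return out


def mode_impute(data):
    """Fill each missing cell with the most frequent observed value of its item."""
    out = [row[:] for row in data]
    for j in range(len(out[0])):
        observed = [row[j] for row in data if row[j] is not None]
        if not observed:
            continue  # nothing to borrow from; leave the item as missing
        mode = Counter(observed).most_common(1)[0][0]
        for row in out:
            if row[j] is None:
                row[j] = mode
    return out


def hot_deck_impute(data, rng):
    """Random hot-deck: copy each missing cell from a randomly drawn donor
    record that has the same item observed."""
    out = [row[:] for row in data]
    for j in range(len(out[0])):
        donors = [row[j] for row in data if row[j] is not None]
        for row in out:
            if row[j] is None and donors:
                row[j] = rng.choice(donors)
    return out


def accuracy(truth, masked, imputed):
    """Share of masked cells whose imputed value equals the true value."""
    hits = total = 0
    for t_row, m_row, i_row in zip(truth, masked, imputed):
        for t, m, i in zip(t_row, m_row, i_row):
            if m is None:
                total += 1
                hits += (t == i)
    return hits / total if total else float("nan")


rng = random.Random(0)
# Synthetic complete data: 200 respondents, 10 binary items (an assumption).
truth = [[int(rng.random() < 0.7) for _ in range(10)] for _ in range(200)]

for rate in (0.05, 0.10, 0.15, 0.20, 0.25):
    acc_mode, acc_hot = [], []
    for _ in range(50):  # fifty Monte Carlo replications per missing rate
        masked = simulate_missing(truth, rate, rng)
        acc_mode.append(accuracy(truth, masked, mode_impute(masked)))
        acc_hot.append(accuracy(truth, masked, hot_deck_impute(masked, rng)))
    print(rate, sum(acc_mode) / 50, sum(acc_hot) / 50)
```

Because the synthetic items are independent, the numbers this sketch prints say nothing about the study's scales; it only shows how masking, imputation, and the accuracy index fit together in one replication loop.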