Imputation Methods for Non-response in Survey Data and Their Comparison by Simulation

Posted on: 2013-04-06  Degree: Master  Type: Thesis
Country: China  Candidate: Y L Liu  Full Text: PDF
GTID: 2247330395484545  Subject: Statistics
Abstract/Summary:
Non-response is a common and difficult problem in statistical surveys and has long concerned statisticians. Non-response often biases analytical results, makes statistical analysis more difficult, and lowers the quality of statistical outputs, so research on non-response has considerable theoretical and applied value.
Prevention is the most effective way to deal with non-response in a survey, namely raising the response rate as far as possible during data collection. However, because the practical conditions of survey work are complicated, experience shows that non-response is unavoidable. Methods for handling non-response once it has occurred are therefore also extremely important, especially imputation for item non-response. At present, compared with other countries, research on non-response data in China is still relatively limited.
Different non-response problems call for different imputation methods. The dissertation summarizes most of the methods for handling non-response in survey data, including direct deletion, weighted complete-case analysis, single imputation, and multiple imputation, and compares their relative merits.
The dissertation then carries out a simulation study of four imputation methods (EM algorithm imputation, MCMC-Gibbs imputation, regression imputation, and mean imputation) applied when fitting a regression model to data containing non-response. The data containing non-response are randomly generated from a normal distribution. The non-response rates are 5%, 10%, 20%, 30%, 40%, and 50%, and the numbers of regression variables are 2, 3, and 5. These cases are compared by simulation.
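The simulation design described above can be illustrated with a minimal sketch. It is not the dissertation's code: the model y = 1 + 2x + noise, the single predictor, the missing-completely-at-random mechanism, the sample size, and all function names are assumptions made here for illustration. Only mean imputation and regression imputation are shown; EM and MCMC-Gibbs imputation are omitted.

```python
import random
import statistics


def simulate(n, rate, seed):
    """Draw normal data and pick which responses to treat as missing.

    Assumed model (not from the dissertation): y = 1 + 2x + e,
    with x and e standard normal, values missing completely at random.
    """
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 + 2.0 * xi + rng.gauss(0.0, 1.0) for xi in x]
    missing = sorted(rng.sample(range(n), round(rate * n)))
    return x, y, missing


def observed_indices(n, missing):
    missing_set = set(missing)
    return [i for i in range(n) if i not in missing_set]


def mean_impute(x, y, missing):
    """Fill every missing y with the mean of the observed y values."""
    obs = observed_indices(len(y), missing)
    fill = statistics.fmean([y[i] for i in obs])
    return {i: fill for i in missing}


def regression_impute(x, y, missing):
    """Fit y on x by least squares on observed cases, then predict missing y."""
    obs = observed_indices(len(y), missing)
    mx = statistics.fmean([x[i] for i in obs])
    my = statistics.fmean([y[i] for i in obs])
    sxx = sum((x[i] - mx) ** 2 for i in obs)
    sxy = sum((x[i] - mx) * (y[i] - my) for i in obs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return {i: intercept + slope * x[i] for i in missing}


def imputation_mse(y, imputed):
    """Mean squared difference between imputed and true (hidden) values."""
    return statistics.fmean([(imputed[i] - y[i]) ** 2 for i in imputed])


if __name__ == "__main__":
    # Non-response rates taken from the abstract.
    for rate in (0.05, 0.10, 0.20, 0.30, 0.40, 0.50):
        x, y, missing = simulate(n=1000, rate=rate, seed=1)
        mse_mean = imputation_mse(y, mean_impute(x, y, missing))
        mse_reg = imputation_mse(y, regression_impute(x, y, missing))
        print(f"rate={rate:.2f}  mean MSE={mse_mean:.3f}  regression MSE={mse_reg:.3f}")
```

Because the true values are known in a simulation, each method's imputed values can be scored directly against them, which is what `imputation_mse` does here.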
The simulation results show that, at the same non-response rate, EM algorithm imputation and regression imputation have small deviation and small mean square error; MCMC-Gibbs imputation has larger deviation and a relatively larger mean square error; and mean imputation has the largest deviation and mean square error of all the methods. As the non-response rate increases, the proportion of available sample data shrinks and the random fluctuation of the imputed values grows, but the mean square error of the imputed values remains relatively stable.
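The abstract does not define "deviation" and "mean square error". Under the usual simulation definitions, and assuming R replications with estimate \hat{\theta}_r of a true value \theta in replication r (these symbols are introduced here, not taken from the dissertation), the two criteria could be written as:

```latex
\bar{\theta} = \frac{1}{R}\sum_{r=1}^{R}\hat{\theta}_r, \qquad
\mathrm{Bias} = \bar{\theta} - \theta, \qquad
\mathrm{MSE} = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \theta\bigr)^2
             = \frac{1}{R}\sum_{r=1}^{R}\bigl(\hat{\theta}_r - \bar{\theta}\bigr)^2 + \mathrm{Bias}^2 .
```

The decomposition shows that the mean square error combines systematic deviation with random fluctuation, which is why the two criteria are reported together.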
Keywords/Search Tags: Non-response, Non-response rate, MSE, EM algorithm, MCMC Method