Comparison of MICE and Regression Imputation for Handling Missing Data
Background: Data collection activities carry a risk of missing data. Missing data may produce biased estimates and inflated standard errors, so an imputation method is needed. The purpose of this study was to investigate which imputation method is most appropriate for handling missing data. The strategies evaluated were complete case analysis, Multivariate Imputation by Chained Equation (MICE), and Regression Imputation. Methods: This was a non-reactive study that used raw data from the RPJMN 2015 Survey from BKKBN East Java Province. Three incomplete data sets with 5%, 10%, and 15% missing data were generated from a complete raw data set. The missing values were generated completely at random. Results: Based on the Friedman test, both imputation methods produced estimates that did not differ from those of the complete raw data set. Based on Mean Square Error (MSE) analysis, MICE yielded lower and more stable MSE values than Regression Imputation in all scenarios. Conclusion: Multivariate Imputation by Chained Equation (MICE) was the most recommended method for handling missing data of less than 15%.
Keywords: Missing data, MICE, Regression Imputation
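The evaluation design described in the abstract (delete values completely at random at 5%, 10%, and 15%, impute, then compare against the complete data by MSE) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the data are synthetic rather than the RPJMN 2015 Survey, the function names are hypothetical, the MSE is taken over column means because the abstract does not state which estimates were compared, and the imputer is a deterministic single-chain regression loop that only mimics the chained-equations idea. It is not full MICE, which draws multiple imputations with random variation.

```python
import numpy as np

def make_mcar(X, rate, rng):
    """Copy X and set roughly `rate` of its cells to NaN, completely at random."""
    X_miss = X.copy()
    mask = rng.random(X.shape) < rate  # each cell is deleted independently
    X_miss[mask] = np.nan
    return X_miss

def chained_regression_impute(X_miss, n_iter=10):
    """Hypothetical helper: fill gaps with column means, then repeatedly regress
    each incomplete column on the others and refresh its missing cells."""
    missing = np.isnan(X_miss)
    col_means = np.nanmean(X_miss, axis=0)
    X = np.where(missing, col_means, X_miss)
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            # Design matrix: intercept plus all other (currently filled) columns.
            others = np.delete(X, j, axis=1)
            A = np.column_stack([np.ones(len(X)), others])
            # Fit on rows where column j was observed, predict where it was missing.
            coef, *_ = np.linalg.lstsq(A[~miss_j], X[~miss_j, j], rcond=None)
            X[miss_j, j] = A[miss_j] @ coef
    return X

def mse(estimate, truth):
    """Mean squared error between two arrays of estimates."""
    return float(np.mean((estimate - truth) ** 2))

# Synthetic correlated data standing in for a complete raw data set (assumption).
rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + rng.normal(scale=0.5, size=n)
x3 = 0.5 * x1 - 0.3 * x2 + rng.normal(scale=0.5, size=n)
complete = np.column_stack([x1, x2, x3])

results = {}
for rate in (0.05, 0.10, 0.15):
    incomplete = make_mcar(complete, rate, rng)
    imputed = chained_regression_impute(incomplete)
    # Compare column means from the imputed data with those from the complete data.
    results[rate] = mse(imputed.mean(axis=0), complete.mean(axis=0))
    print(f"{rate:.0%} missing: MSE of column means = {results[rate]:.5f}")
```

Setting `n_iter=1` reduces the loop to a single regression pass from the mean-filled start, which is closer in spirit to plain regression imputation; running both settings over many random seeds would be one way to compare the stability of the resulting MSE values.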
This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.