Comparison of MICE and Regression Imputation for Handling Missing Data

  • Berliana Devianti Putri, Faculty of Public Health, Airlangga University
  • Hari Basuki Notobroto, Faculty of Public Health, Airlangga University
  • Arief Wibowo, Faculty of Public Health, Airlangga University

Abstract

Data collection activities carry a risk of missing data. Missing data may produce biased estimates and inflated standard errors, so an imputation method is needed. The purpose of this study was to investigate which imputation method is most appropriate for handling missing data. The strategies evaluated were complete case analysis, Multivariate Imputation by Chained Equations (MICE), and Regression Imputation. This was a non-reactive study that used raw data from the RPJMN 2015 Survey from BKKBN East Java Province. Three incomplete data sets, with 5%, 10%, and 15% missing data, were generated from a complete raw data set. The values were removed under a missing completely at random (MCAR) mechanism. Based on the Friedman test, both imputation methods produced estimates that did not differ from those of the complete raw data set. Based on Mean Square Error (MSE) analysis, MICE yielded lower and more stable MSE values than Regression Imputation in all scenarios. Conclusion: MICE was the most recommended method for handling missing data when less than 15% of the data are missing.
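The comparison design described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure or data: it assumes synthetic multivariate normal data in place of the RPJMN 2015 Survey, uses column means as the estimates being compared, and replaces full MICE software with a simplified chained-equations loop (linear regression plus a random residual draw). The function names make_mcar, regression_impute, and chained_impute are hypothetical.

```python
import numpy as np

def make_mcar(X, rate, rng):
    """Copy X and set each entry to NaN independently with probability `rate` (MCAR)."""
    X_miss = X.copy()
    X_miss[rng.random(X.shape) < rate] = np.nan
    return X_miss

def _fit_predict(X_filled, observed, j):
    """OLS of column j on the other columns, fitted on rows where column j is observed."""
    others = np.delete(X_filled, j, axis=1)
    A = np.column_stack([np.ones(len(others)), others])
    beta, *_ = np.linalg.lstsq(A[observed], X_filled[observed, j], rcond=None)
    pred = A @ beta
    resid_sd = np.std(X_filled[observed, j] - pred[observed])
    return pred, resid_sd

def regression_impute(X_miss):
    """Deterministic regression imputation (assumption: predictors are mean-filled)."""
    miss = np.isnan(X_miss)
    X_mean = np.where(miss, np.nanmean(X_miss, axis=0), X_miss)
    X_out = X_mean.copy()
    for j in range(X_miss.shape[1]):
        pred, _ = _fit_predict(X_mean, ~miss[:, j], j)
        X_out[miss[:, j], j] = pred[miss[:, j]]
    return X_out

def chained_impute(X_miss, rng, n_iter=10, n_imputations=5):
    """Simplified chained equations: cycle over columns, re-imputing each one
    from the current values of the others, with a normal residual draw added."""
    miss = np.isnan(X_miss)
    completed = []
    for _ in range(n_imputations):
        X_cur = np.where(miss, np.nanmean(X_miss, axis=0), X_miss)
        for _ in range(n_iter):
            for j in range(X_miss.shape[1]):
                pred, sd = _fit_predict(X_cur, ~miss[:, j], j)
                noise = rng.normal(0.0, sd, size=int(miss[:, j].sum()))
                X_cur[miss[:, j], j] = pred[miss[:, j]] + noise
        completed.append(X_cur)
    return completed

def mse(estimate, truth):
    """Mean squared error between two vectors of estimates."""
    return float(np.mean((np.asarray(estimate) - np.asarray(truth)) ** 2))

rng = np.random.default_rng(0)
cov = [[1.0, 0.6, 0.3], [0.6, 1.0, 0.5], [0.3, 0.5, 1.0]]  # illustrative values
X_full = rng.multivariate_normal([0.0, 0.0, 0.0], cov, size=500)
true_means = X_full.mean(axis=0)

results = {}
for rate in (0.05, 0.10, 0.15):
    X_miss = make_mcar(X_full, rate, rng)
    reg_means = regression_impute(X_miss).mean(axis=0)
    # Pool the chained-equations runs by averaging the per-imputation estimates.
    mice_means = np.mean([Xc.mean(axis=0) for Xc in chained_impute(X_miss, rng)], axis=0)
    results[rate] = (mse(reg_means, true_means), mse(mice_means, true_means))
    print(f"missing={rate:.0%}  regression MSE={results[rate][0]:.6f}  "
          f"chained MSE={results[rate][1]:.6f}")
```

A single run like this shows only the mechanics of the comparison. Which method comes out ahead depends on the data, the estimates chosen, and the random seed, so it should not be read as reproducing the study's findings.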

Published
Feb 28, 2018
How to Cite
PUTRI, Berliana Devianti; NOTOBROTO, Hari Basuki; WIBOWO, Arief. Comparison of MICE and Regression Imputation for Handling Missing Data. Health Notions, [S.l.], v. 2, n. 2, p. 183-186, feb. 2018. ISSN 2580-4936. Available at: <http://heanoti.com/index.php/hn/article/view/hn20207>. Date accessed: 22 aug. 2018.
Section
Research Article