Comparison of MICE and Regression Imputation for Handling Missing Data

  • Berliana Devianti Putri, Airlangga University
  • Hari Basuki Notobroto, Public Health, Airlangga University
  • Arief Wibowo, Airlangga University

Abstract

Background: Data collection activities carry a high risk of missing data. Missing data can bias estimates and inflate standard errors, so an imputation method is needed. The purpose of this study was to determine which imputation method is most appropriate for handling missing data. The strategies evaluated were complete case analysis, Multivariate Imputation by Chained Equations (MICE), and regression imputation. Methods: This was a non-reactive study using raw data from the RPJMN 2015 Survey of BKKBN East Java Province. Three incomplete data sets, with 5%, 10%, and 15% missing data, were generated from the complete raw data set. Values in the incomplete data sets were made missing completely at random (MCAR). Results: Based on the Friedman test, both imputation methods produced estimates that did not differ from those of the complete raw data set. Based on mean square error (MSE) analysis, MICE yielded lower and more stable MSE values than regression imputation in all scenarios. Conclusion: MICE was the most recommended method for handling missing data when less than 15% of the data are missing.


Keywords: Missing data, MICE, Regression Imputation
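
The evaluation design described in the abstract (delete values completely at random at 5%, 10%, and 15%, impute, then compare with the complete data) can be illustrated with a minimal sketch. It is not the authors' code and does not use the survey data: the synthetic variables, the function names (`make_mcar`, `regression_impute`, `mse_on_missing`, `run_demo`), and the choice to compute MSE on the imputed values themselves are all assumptions made for illustration, and the study may have defined MSE differently. Only single regression imputation is shown; MICE, which iterates such regressions across several incomplete variables, is not implemented here.

```python
import random
import statistics


def make_mcar(values, rate, rng):
    """Copy `values`, setting a fraction `rate` of entries to None.

    Positions are drawn uniformly at random, independent of the data,
    which is what "missing completely at random" (MCAR) means.
    """
    n_missing = round(rate * len(values))
    missing_idx = set(rng.sample(range(len(values)), n_missing))
    return [None if i in missing_idx else v for i, v in enumerate(values)]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a + b * x; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


def regression_impute(xs, ys_incomplete):
    """Fill each missing y with the prediction from a line fitted on complete pairs."""
    observed = [(x, y) for x, y in zip(xs, ys_incomplete) if y is not None]
    a, b = fit_line([x for x, _ in observed], [y for _, y in observed])
    return [a + b * x if y is None else y for x, y in zip(xs, ys_incomplete)]


def mse_on_missing(y_true, y_incomplete, y_imputed):
    """Mean squared error between true and imputed values, on deleted entries only."""
    errors = [
        (t - p) ** 2
        for t, m, p in zip(y_true, y_incomplete, y_imputed)
        if m is None
    ]
    return statistics.fmean(errors)


def run_demo(seed=0, n=200):
    """Run the three missingness scenarios on synthetic data (hypothetical values)."""
    rng = random.Random(seed)
    xs = [rng.gauss(0, 1) for _ in range(n)]
    ys = [2.0 + 1.5 * x + rng.gauss(0, 0.5) for x in xs]
    results = {}
    for rate in (0.05, 0.10, 0.15):
        y_missing = make_mcar(ys, rate, rng)
        y_imputed = regression_impute(xs, y_missing)
        results[rate] = mse_on_missing(ys, y_missing, y_imputed)
    return results


if __name__ == "__main__":
    for rate, mse in run_demo().items():
        print(f"{rate:.0%} missing: MSE = {mse:.4f}")
```

To compare methods in the way the abstract describes, the same masked data sets would be passed to each imputation method and the resulting MSE values compared across the three scenarios.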

Published
Feb 4, 2018
How to Cite
PUTRI, Berliana Devianti; NOTOBROTO, Hari Basuki; WIBOWO, Arief. Comparison of MICE and Regression Imputation for Handling Missing Data. Health Notions, [S.l.], v. 2, n. 2, feb. 2018. ISSN 2580-4936. Available at: <http://heanoti.com/index.php/hn/article/view/162>. Date accessed: 22 may 2018.
Section
Research Article