Multiple imputation using chained equations: issues and guidance for practice

Multiple imputation by chained equations (MICE) is a flexible and practical approach to handling missing data. We describe the principles of the method and show how to impute categorical and quantitative variables, including skewed variables. We give guidance on how to specify the imputation model and how many imputations are needed. We describe the practical analysis of multiply imputed data, including model building and model checking. We stress the limitations of the method and discuss the possible pitfalls. We illustrate the ideas using a data set in mental health, giving Stata code fragments.
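
The chained-equations idea summarised above can be sketched in a few lines. This is a minimal illustration in Python with NumPy, not the Stata code from the paper: the function names `chained_equations_impute` and `multiply_impute` are hypothetical, every variable is treated as continuous and imputed by linear regression on all the others, and the regression coefficients are not redrawn from their posterior, so this sketch understates imputation uncertainty compared with a proper implementation.

```python
import numpy as np


def chained_equations_impute(X, n_iter=10, rng=None):
    """Return one completed copy of X, where NaN marks a missing value.

    Simplified sketch: each incomplete column is regressed linearly on all
    other columns, and missing entries are replaced by the prediction plus
    normally distributed residual noise.
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()

    # Initialise missing entries with the observed column means.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])

    n, p = X.shape
    for _ in range(n_iter):          # cycles through the chain
        for j in range(p):           # one regression per incomplete variable
            miss_j = missing[:, j]
            if not miss_j.any():
                continue
            obs_j = ~miss_j

            # Design matrix: intercept plus the current values of the other columns.
            others = np.delete(filled, j, axis=1)
            A = np.column_stack([np.ones(n), others])

            # Fit on rows where column j was actually observed.
            beta, *_ = np.linalg.lstsq(A[obs_j], filled[obs_j, j], rcond=None)
            resid = filled[obs_j, j] - A[obs_j] @ beta
            dof = max(int(obs_j.sum()) - A.shape[1], 1)
            sigma = np.sqrt(resid @ resid / dof)

            # Stochastic imputation: prediction plus residual noise.
            noise = rng.normal(0.0, sigma, int(miss_j.sum()))
            filled[miss_j, j] = A[miss_j] @ beta + noise
    return filled


def multiply_impute(X, m=5, n_iter=10, seed=0):
    """Return a list of m completed data sets drawn from one random stream."""
    rng = np.random.default_rng(seed)
    return [chained_equations_impute(X, n_iter, rng) for _ in range(m)]
```

Each of the `m` completed data sets would then be analysed separately and the estimates combined with Rubin's rules. Categorical or skewed variables, which the paper addresses, would need a different conditional model or a transformation in place of the linear regression step used here.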

Task                                | Dataset                                | Model | Metric Name              | Metric Value | Global Rank
Multivariate Time Series Imputation | Beijing Multi-Site Air-Quality Dataset | MICE  | MAE (PM2.5)              | 27.42        | # 6
Multivariate Time Series Imputation | KDD CUP Challenge 2018                 | MICE  | MSE (10% missing)        | 0.468        | # 4
Multivariate Time Series Imputation | PhysioNet Challenge 2012               | MICE  | MAE (10% of data as GT)  | 0.634        | # 5
Multivariate Time Series Imputation | UCI localization data                  | MICE  | MAE (10% missing)        | 0.477        | # 4
