Missingness Augmentation: A General Approach for Improving Generative Imputation Models
Despite tremendous progress on the missing data imputation task, designing new imputation models has become increasingly cumbersome, while the corresponding gains remain relatively small. Is there a simple yet general approach that can exploit existing models to further improve imputation quality? In this article, we address this question and propose a novel, general data augmentation method called Missingness Augmentation (MA), which can be applied to many existing generative imputation frameworks to further improve their performance. In MA, before each training epoch, we use the outputs of the generator to expand the incomplete samples on the fly, and then define a special reconstruction loss for these augmented samples. This reconstruction loss, added to the original loss, constitutes the final optimization objective of the model. Notably, MA is very efficient and requires no change to the structure of the original model. Experimental results demonstrate that MA can significantly improve the performance of many recently developed generative imputation models on a variety of datasets. Our code is available at https://github.com/WYu-Feng/Missingness-Augmentation.
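The procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify the form of the reconstruction loss, so the mean squared error used here, the `weight` parameter, the zero-filling of missing entries, and the toy `mean_generator` standing in for a trained generator are all assumptions made for illustration. The mask convention (1 for observed, 0 for missing) is likewise assumed.

```python
import numpy as np


def mean_generator(x, mask):
    """Toy stand-in for a trained generator (hypothetical): predicts the
    column mean of the observed entries for every cell."""
    col_sum = (x * mask).sum(axis=0)
    col_cnt = np.maximum(mask.sum(axis=0), 1)  # avoid division by zero
    return np.broadcast_to(col_sum / col_cnt, x.shape).copy()


def augment_missing(x, mask, generator):
    """Expand incomplete samples: keep observed entries (mask == 1) and
    fill missing entries with the generator's outputs."""
    x_in = np.where(mask == 1, x, 0.0)  # assumed placeholder for missing cells
    x_hat = generator(x_in, mask)
    return np.where(mask == 1, x, x_hat)


def reconstruction_loss(x_aug, generator):
    """Assumed loss form: treat the augmented sample as fully observed,
    pass it through the generator again, and take the mean squared error."""
    x_rec = generator(x_aug, np.ones_like(x_aug))
    return float(np.mean((x_rec - x_aug) ** 2))


def total_loss(original_loss, x, mask, generator, weight=1.0):
    """Final objective: original loss plus the weighted reconstruction
    loss on the augmented samples (weight is an assumed hyperparameter)."""
    x_aug = augment_missing(x, mask, generator)
    return original_loss + weight * reconstruction_loss(x_aug, generator)
```

In a real training loop, `augment_missing` would be called before each epoch with the current generator, and `original_loss` would be whatever objective the underlying imputation model already optimizes; the model architecture itself stays unchanged.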