Efficient Multiple Imputation for Diverse Data in Python and R: MIDASpy and rMIDAS

Dublin Core

Title

Efficient Multiple Imputation for Diverse Data in Python and R: MIDASpy and rMIDAS

Subject

missing data, multiple imputation, machine learning, Python, R.

Description

This paper introduces software packages for efficiently imputing missing data using
deep learning methods in Python (MIDASpy) and R (rMIDAS). The packages implement
a recently developed approach to multiple imputation known as MIDAS, which involves
introducing additional missing values into the dataset, attempting to reconstruct these
values with a type of unsupervised neural network known as a denoising autoencoder, and
using the resulting model to draw imputations of originally missing data. These steps are
executed by a fast and flexible algorithm that expands both the quantity and the range of
data that can be analyzed with multiple imputation. To help users optimize the algorithm
for their particular application, MIDASpy and rMIDAS offer a host of user-friendly tools
for calibrating and validating the imputation model. We provide a detailed guide to these
functionalities and demonstrate their usage on a large real dataset.
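
The corruption step described above (introducing additional missing values and scoring the reconstruction on those values) can be illustrated with a minimal sketch. It uses only the Python standard library and is not the MIDASpy or rMIDAS API. The function names `corrupt`, `column_means`, and `reconstruction_error` are hypothetical, and a simple column-mean fill stands in for the denoising autoencoder, which is an assumption made purely to keep the example self-contained.

```python
import random


def corrupt(rows, drop_rate, rng):
    """Hide a random share of the observed cells (None marks a missing value).

    Returns a corrupted copy plus the (row, col) positions hidden here,
    i.e. the cells a denoising model would be trained to reconstruct.
    """
    corrupted = [list(row) for row in rows]
    hidden = []
    for i, row in enumerate(rows):
        for j, value in enumerate(row):
            if value is not None and rng.random() < drop_rate:
                corrupted[i][j] = None
                hidden.append((i, j))
    return corrupted, hidden


def column_means(rows):
    """Mean of the observed values in each column (0.0 if none are observed)."""
    means = []
    for j in range(len(rows[0])):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed) if observed else 0.0)
    return means


def reconstruction_error(original, reconstructed, hidden):
    """Mean squared error over the artificially hidden cells only."""
    if not hidden:
        return 0.0
    total = sum((original[i][j] - reconstructed[i][j]) ** 2 for i, j in hidden)
    return total / len(hidden)


# Toy dataset: None marks values that were missing from the start.
data = [[1.0, 2.0], [None, 4.0], [3.0, None], [5.0, 6.0]]

rng = random.Random(0)
corrupted, hidden = corrupt(data, drop_rate=0.5, rng=rng)

# Placeholder "model": fill every missing cell with its column mean.
# A real denoising autoencoder would learn this mapping instead.
means = column_means(corrupted)
filled = [
    [means[j] if value is None else value for j, value in enumerate(row)]
    for row in corrupted
]

error = reconstruction_error(data, filled, hidden)
print("hidden cells:", hidden)
print("reconstruction error on hidden cells:", error)
```

The error is computed only on cells that were deliberately hidden, because those are the only cells whose true values are known. The originally missing cells are filled by the same model but cannot be scored.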

Creator

Ranjit Lall

Source

https://www.jstatsoft.org/article/view/v107i09

Publisher

University of Oxford

Date

October 2023

Contributor

Fajar bagus W

Format

PDF

Language

English

Type

Text

Citation

Ranjit Lall, “Efficient Multiple Imputation for Diverse Data in Python and R: MIDASpy and rMIDAS,” Repository Horizon University Indonesia, accessed April 17, 2025, https://repository.horizon.ac.id/items/show/8312.