PolyMNIST

Introduced by Sutter et al. in Generalized Multimodal ELBO

The dataset is based on the original MNIST dataset. Compared to the original dataset, the digits are scaled down by a factor of $0.75$ so that there is more space for the random translation. PolyMNIST consists of 5 different modalities.

The background of every modality $\mathbf{x}_m$ consists of random patches of size $28 \times 28$ from a large image, and the digit is placed at a random position within the patch. With this setup, every modality carries modality-specific information, given by its background image, and shared information, given by the digit, which is common to all modalities. An additional difficulty compared to the original PolyMNIST is the random translation of the digits.
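The construction described above can be sketched in a few lines of plain Python. This is an illustrative sketch, not the authors' generation code: the function names, the nearest-neighbour resampling, the max-based blending of digit and background, and the choice to draw a separate digit position for each modality are all assumptions. Only the scaling factor $0.75$, the $28 \times 28$ patch size, and the random crop and placement come from the description.

```python
import random

PATCH = 28    # patch side length, from the description
SCALE = 0.75  # digit downscaling factor, from the description


def downscale(digit, scale=SCALE):
    """Shrink a square 2D list by nearest-neighbour sampling (assumed method)."""
    dst = int(len(digit) * scale)
    return [[digit[int(r / scale)][int(c / scale)] for c in range(dst)]
            for r in range(dst)]


def random_patch(background, rng, size=PATCH):
    """Crop a size x size patch at a uniformly random location."""
    top = rng.randrange(len(background) - size + 1)
    left = rng.randrange(len(background[0]) - size + 1)
    return [row[left:left + size] for row in background[top:top + size]]


def make_sample(digit, background, rng):
    """Build one modality: random background patch plus a translated small digit."""
    patch = random_patch(background, rng)
    small = downscale(digit)
    n = len(small)
    # Random translation: any offset that keeps the digit inside the patch.
    top = rng.randrange(PATCH - n + 1)
    left = rng.randrange(PATCH - n + 1)
    for r in range(n):
        for c in range(n):
            # Assumed blending rule: keep the brighter of digit and background.
            patch[top + r][left + c] = max(patch[top + r][left + c], small[r][c])
    return patch


def make_polymnist_example(digit, backgrounds, seed=0):
    """Return one image per background; the digit is the shared content."""
    rng = random.Random(seed)
    return [make_sample(digit, bg, rng) for bg in backgrounds]
```

Calling `make_polymnist_example(digit, backgrounds)` with one $28 \times 28$ digit and five large background images returns five $28 \times 28$ images that share the digit but differ in background and, under the assumption made here, in digit position.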

License


  • Unknown