CI-MNIST (Correlated and Imbalanced MNIST) is a variant of the MNIST dataset that introduces different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, depending on whether the digit shown in $x$ is even or odd. The dataset defines the background color as the protected or sensitive attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed to evaluate bias-mitigation approaches in challenging setups and to allow control over different dataset configurations.
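The labeling rule above can be illustrated with a minimal sketch. It is not the dataset's official generation code: the function names, the RGB color values, the intensity threshold used to separate background from digit strokes, and the mapping $s = 0$ for blue and $s = 1$ for red (inferred from the order in which the groups are listed) are all assumptions made for illustration, and the sketch does not reproduce the correlation or imbalance controls.

```python
# Hypothetical color constants; the actual dataset may use different RGB values.
BLUE = (0, 0, 255)  # assumed s = 0, unprivileged group
RED = (255, 0, 0)   # assumed s = 1, privileged group


def eligibility_label(digit):
    """Return y = 1 (eligible) for an even digit, y = 0 (ineligible) for an odd one."""
    return 1 if digit % 2 == 0 else 0


def colorize_background(image, s, threshold=50):
    """Convert a grayscale image (rows of 0-255 ints) to RGB with a colored background.

    Pixels darker than `threshold` are treated as background (an assumption)
    and painted blue or red according to the sensitive attribute s.
    """
    if s not in (0, 1):
        raise ValueError("s must be 0 or 1")
    color = RED if s == 1 else BLUE
    return [
        [color if px < threshold else (px, px, px) for px in row]
        for row in image
    ]


# Toy 3x3 "digit": a single bright stroke pixel in the center.
toy_image = [[0, 0, 0], [0, 255, 0], [0, 0, 0]]
colored = colorize_background(toy_image, s=1)
label = eligibility_label(4)
```

Here `colored` keeps the bright center pixel as a gray-level triple and paints the remaining pixels red, while `label` is 1 because 4 is even.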
4 PAPERS • NO BENCHMARKS YET
We have cleaned the noisy IMDB-WIKI dataset using a constrained clustering method, resulting in this new benchmark for in-the-wild age estimation. The annotations also allow this dataset to be used for other tasks, such as gender classification and face recognition/verification. For more details, please refer to our FPAge paper.
3 PAPERS • 1 BENCHMARK
The LAGENDA dataset is a large-scale dataset with age and gender annotations for face and body bounding boxes. It consists of 67,159 images from the Open Images Dataset and comprises 84,192 (FaceCrop, BodyCrop) pairs. The dataset offers a high level of diversity, encompassing various scenes and domains. It contains minimal celebrity data, thus reflecting real-world, in-the-wild scenarios. The dataset spans a wide age range, from 0 to 95 years.
3 PAPERS • 4 BENCHMARKS