ImageNet-P consists of noise, blur, weather, and digital distortions. The dataset has validation perturbations; has difficulty levels; has CIFAR-10, Tiny ImageNet, ImageNet 64 × 64, standard, and Inception-sized editions; and has been designed for benchmarking not training networks. ImageNet-P departs from ImageNet-C by having perturbation sequences generated from each ImageNet validation image. Each sequence contains more than 30 frames, so to counteract an increase in dataset size and evaluation time only 10 common perturbations are used.
29 PAPERS • 1 BENCHMARK
Wild-Time is a benchmark of 5 datasets that reflect temporal distribution shifts arising in a variety of real-world applications, including patient prognosis and news classification. On these datasets, we systematically benchmark 13 prior approaches, including methods in domain generalization, continual learning, self-supervised learning, and ensemble learning.
4 PAPERS • NO BENCHMARKS YET