The MNIST database (Modified National Institute of Standards and Technology database) is a large collection of handwritten digits. It has a training set of 60,000 examples, and a test set of 10,000 examples. It is a subset of a larger NIST Special Database 3 (digits written by employees of the United States Census Bureau) and Special Database 1 (digits written by high school students) which contain monochrome images of handwritten digits. The digits have been size-normalized and centered in a fixed-size image. The original black and white (bilevel) images from NIST were size normalized to fit in a 20x20 pixel box while preserving their aspect ratio. The resulting images contain grey levels as a result of the anti-aliasing technique used by the normalization algorithm. the images were centered in a 28x28 image by computing the center of mass of the pixels, and translating the image so as to position this point at the center of the 28x28 field.
6,994 PAPERS • 52 BENCHMARKS
Fashion-MNIST is a dataset comprising of 28×28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST shares the same image size, data format and the structure of training and testing splits with the original MNIST.
2,790 PAPERS • 17 BENCHMARKS
The STL-10 is an image dataset derived from ImageNet and popularly used to evaluate algorithms of unsupervised feature learning or self-taught learning. Besides 100,000 unlabeled images, it contains 13,000 labeled images from 10 object classes (such as birds, cats, trucks), among which 5,000 images are partitioned for training while the remaining 8,000 images for testing. All the images are color images with 96×96 pixels in size.
966 PAPERS • 17 BENCHMARKS
The Caltech101 dataset contains images from 101 object categories (e.g., “helicopter”, “elephant” and “chair” etc.) and a background category that contains the images not from the 101 object categories. For each object category, there are about 40 to 800 images, while most classes have about 50 images. The resolution of the image is roughly about 300×200 pixels.
581 PAPERS • 7 BENCHMARKS
MVTec AD is a dataset for benchmarking anomaly detection methods with a focus on industrial inspection. It contains over 5000 high-resolution images divided into fifteen different object and texture categories. Each category comprises a set of defect-free training images and a test set of images with various kinds of defects as well as images without defects.
288 PAPERS • 4 BENCHMARKS
The dataset is constructed from images of defective production items that were provided and annotated by Kolektor Group d.o.o.. The images were captured in a controlled industrial environment in a real-world case.
13 PAPERS • 3 BENCHMARKS
KolektorSDD2 is a surface-defect detection dataset with over 3000 images containing several types of defects, obtained while addressing a real-world industrial problem.
11 PAPERS • 4 BENCHMARKS
UBI-Fights - Concerning a specific anomaly detection and still providing a wide diversity in fighting scenarios, the UBI-Fights dataset is a unique new large-scale dataset of 80 hours of video fully annotated at the frame level. Consisting of 1000 videos, where 216 videos contain a fight event, and 784 are normal daily life situations. All unnecessary video segments (e.g., video introductions, news, etc.) that could disturb the learning process were removed.
7 PAPERS • 2 BENCHMARKS
This is a synthetic dataset for defect detection on textured surfaces. It was originally created for a competition at the 2007 symposium of the DAGM (Deutsche Arbeitsgemeinschaft für Mustererkennung e.V., the German chapter of the International Association for Pattern Recognition). The competition was hosted together with the GNSS (German Chapter of the European Neural Network Society).
5 PAPERS • 3 BENCHMARKS
Multi-pose Anomaly Detection (MAD) dataset, which represents the first attempt to evaluate the performance of pose-agnostic anomaly detection. The MAD dataset containing 4,000+ highresolution multi-pose views RGB images with camera/pose information of 20 shape-complexed LEGO animal toys for training, as well as 7,000+ simulation and real-world collected RGB images (without camera/pose information) with pixel-precise ground truth annotations for three types of anomalies in test sets. Note that MAD has been further divided into MAD-Sim and MAD-Real for simulation-to-reality studies to bridge the gap between academic research and the demands of industrial manufacturing.
2 PAPERS • 1 BENCHMARK
MIAD contains more than 100K high-resolution color images in various outdoor industrial scenarios, designed for unsupervised anomaly detection. This dataset is generated by a 3D graphics software and covers both surface and logical anomalies with pixel-precise ground truth.
1 PAPER • NO BENCHMARKS YET