A large set of images of cats and dogs.
11 PAPERS • 1 BENCHMARK
Four pathologists from Longhua Hospital Shanghai University of Traditional Chinese Medicine provide 600 images of gastric cancer pathology images at size 2048$\times$2048 pixels. These images were scanned using a NewUsbCamera and digitized at $\times$20 magnification, tissue-level labels were also given by the four experienced pathologists. Based on that, five biomedical researchers from Northeastern University cropped them to 245,196 sub-sized gastric cancer pathology images, and two experienced pathologists from Liaoning Cancer Hospital and Institute perform the calibration. The 245,196 images were split to three sizes (160$\times$160, 120$\times$120, 80$\times$80) for two categories: abnormal and normal.
So2Sat LCZ42 consists of local climate zone (LCZ) labels of about half a million Sentinel-1 and Sentinel-2 image patches in 42 urban agglomerations (plus 10 additional smaller areas) across the globe. This dataset was labeled by 15 domain experts following a carefully designed labeling work flow and evaluation process over a period of six months.
BCN_20000 is a dataset composed of 19,424 dermoscopic images of skin lesions captured from 2010 to 2016 in the facilities of the Hospital Clínic in Barcelona. The dataset can be used for lesion recognition tasks such as lesion segmentation, lesion detection and lesion classification.
10 PAPERS • NO BENCHMARKS YET
HyperKvasir dataset contains 110,079 images and 374 videos where it captures anatomical landmarks and pathological and normal findings. A total of around 1 million images and video frames altogether.
10 PAPERS • 2 BENCHMARKS
Kuzushiji-49 is an MNIST-like dataset that has 49 classes (28x28 grayscale, 270,912 images) from 48 Hiragana characters and one Hiragana iteration mark.
MLRSNet is a a multi-label high spatial resolution remote sensing dataset for semantic scene understanding. It provides different perspectives of the world captured from satellites. That is, it is composed of high spatial resolution optical satellite images. MLRSNet contains 109,161 remote sensing images that are annotated into 46 categories, and the number of sample images in a category varies from 1,500 to 3,000. The images have a fixed size of 256×256 pixels with various pixel resolutions (~10m to 0.1m). Moreover, each image in the dataset is tagged with several of 60 predefined class labels, and the number of labels associated with each image varies from 1 to 13. The dataset can be used for multi-label based image classification, multi-label based image retrieval, and image segmentation.
10 PAPERS • 1 BENCHMARK
DeepFish as a benchmark suite with a large-scale dataset to train and test methods for several computer vision tasks. The dataset consists of approximately 40 thousand images collected underwater from 20 habitats in the marine environments of tropical Australia. It contains classification labels as well as point-level and segmentation labels to have a more comprehensive fish analysis benchmark. These labels enable models to learn to automatically monitor fish count, identify their locations, and estimate their sizes.
9 PAPERS • NO BENCHMARKS YET
Food2K is a large food recognition dataset with 2,000 categories and over 1 million images. Compared with existing food recognition datasets, Food2K bypasses them in both categories and images by one order of magnitude, and thus establishes a new challenging benchmark to develop advanced models for food visual representation learning. Food2K can be further explored to benefit more food-relevant tasks including emerging and more complex ones (e.g., nutritional understanding of food), and the trained models on Food2K can be expected as backbones to improve the performance of more food-relevant tasks.
The Breast Cancer Histopathological Image Classification (BreakHis) is composed of 9,109 microscopic images of breast tumor tissue collected from 82 patients using different magnifying factors (40X, 100X, 200X, and 400X). It contains 2,480 benign and 5,429 malignant samples (700X460 pixels, 3-channel RGB, 8-bit depth in each channel, PNG format). This database has been built in collaboration with the P&D Laboratory - Pathological Anatomy and Cytopathology, Parana, Brazil.
8 PAPERS • 5 BENCHMARKS
FoodX-251 is a dataset of 251 fine-grained classes with 118k training, 12k validation and 28k test images. Human verified labels are made available for the training and test images. The classes are fine-grained and visually similar, for example, different types of cakes, sandwiches, puddings, soups, and pastas.
8 PAPERS • 1 BENCHMARK
We introduce ArtBench-10, the first class-balanced, high-quality, cleanly annotated, and standardized dataset for benchmarking artwork generation. It comprises 60,000 images of artwork from 10 distinctive artistic styles, with 5,000 training images and 1,000 testing images per style. ArtBench-10 has several advantages over previous artwork datasets. Firstly, it is class-balanced while most previous artwork datasets suffer from the long tail class distributions. Secondly, the images are of high quality with clean annotations. Thirdly, ArtBench-10 is created with standardized data collection, annotation, filtering, and preprocessing procedures. We provide three versions of the dataset with different resolutions (32×32, 256×256, and original image size), formatted in a way that is easy to be incorporated by popular machine learning frameworks.
7 PAPERS • 1 BENCHMARK
Bamboo Dataset is a mega-scale and information-dense dataset for both classification and detection pre-training. It is built upon integrating 24 public datasets (e.g. ImagenNet, Places365, Object365, OpenImages) and added new annotations through active learning. Bamboo has 69M image classification annotations and 32M object bounding boxes.
7 PAPERS • NO BENCHMARKS YET
A new challenging dataset that can be used for many pattern recognition tasks.
7 PAPERS • 2 BENCHMARKS
The Diabetic Foot Ulcers dataset (DFUC2021) is a dataset for analysis of pathology, focusing on infection and ischaemia. The final release of DFUC2021 consists of 15,683 DFU patches, with 5,955 training, 5,734 for testing and 3,994 unlabeled DFU patches. The ground truth labels are four classes, i.e. control, infection, ischaemia and both conditions.
Grocery Store is a dataset of natural images of grocery items. All natural images were taken with a smartphone camera in different grocery stores. It contains 5,125 natural images from 81 different classes of fruits, vegetables, and carton items (e.g. juice, milk, yoghurt). The 81 classes are divided into 42 coarse-grained classes, where e.g. the fine-grained classes 'Royal Gala' and 'Granny Smith' belong to the same coarse-grained class 'Apple'. Additionally, each fine-grained class has an associated iconic image and a product description of the item.
The Kannada-MNIST dataset is a drop-in substitute for the standard MNIST dataset for the Kannada language.
Consists of faces extracted from pre-modern Japanese artwork.
The NCT-CRC-HE-100K dataset is a set of 100,000 non-overlapping image patches extracted from 86 H$\&$E stained human cancer tissue slides and normal tissue from the NCT biobank (National Center for Tumor Diseases) and the UMM pathology archive (University Medical Center Mannheim). While the dataset Colorectal Cacner-Validation-Histology-7K (CRC-VAL-HE-7K) consist of 7180 images extracted from 50 patients with colorectal adenocarcinoma and were used to create a dataset that does not overlap with patients in the NCT-CRC-HE-100K dataset. It was created by pathologists by manually delineating tissue regions in whole slide images into the following nine tissue classes: Adipose (ADI), background (BACK), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), colorectal adenocarcinoma epithelium (TUM).
To benchmark Bengali digit recognition algorithms, a large publicly available dataset is required which is free from biases originating from geographical location, gender, and age. With this aim in mind, NumtaDB, a dataset consisting of more than 85,000 images of hand-written Bengali digits, has been assembled.
A synthetic dataset uses for a systematic analysis across common factors of variation.
The iCartoonFace dataset is a large-scale dataset that can be used for two different tasks: cartoon face detection and cartoon face recognition.
Dataset aimed to do automated aerial scene classification of disaster events from on-board a UAV.
6 PAPERS • NO BENCHMARKS YET
This is a dataset with spurious correlations which can be used to evaluate machine learning methods for out-of-distribution generalization, causal inference, and related field.
6 PAPERS • 1 BENCHMARK
F-CelebA - This dataset is adapted from federated learning. Federated learning is an emerging machine learning paradigm with an emphasis on data privacy. The idea is to train through model aggregation rather than conventional data aggregation and keep local data staying on the local device. This dataset naturally consists of similar tasks and each of the 10 tasks contains images of a celebrity labeled by whether he/she is smiling or not. More detailed please check page https://github.com/ZixuanKe/CAT
The Food-101N dataset is introduced in "CleanNet: Transfer Learning for Scalable Image Training with Label Noise (CVPR'18). It is an image dataset containing about 310,009 images of food recipes classified in 101 classes (categories). Food-101N and the Food-101 dataset share the same 101 classes, whereas Food-101N has much more images and is more noisy.
The PS-Battles dataset is gathered from a large community of image manipulation enthusiasts and provides a basis for media derivation and manipulation detection in the visual domain. The dataset consists of 102'028 images grouped into 11'142 subsets, each containing the original image as well as a varying number of manipulated derivatives.
Danish Fungi 2020 (DF20) is a fine-grained dataset and benchmark. The dataset, constructed from observations submitted to the Danish Fungal Atlas, is unique in its taxonomy-accurate class labels, small number of errors, highly unbalanced long-tailed class distribution, rich observation metadata, and well-defined class hierarchy. DF20 has zero overlap with ImageNet, allowing unbiased comparison of models fine-tuned from publicly available ImageNet checkpoints.
5 PAPERS • 1 BENCHMARK
Approx. 300,000 images of galaxies labelled by shape.
5 PAPERS • NO BENCHMARKS YET
ImageNet-9 consists of images with different amounts of background and foreground signal, which you can use to measure the extent to which your models rely on image backgrounds. This dataset is helpful in testing the robustness of vision models with respect to their dependence on the backgrounds of images.
The dataset contains a total of 27,558 cell images with equal instances of parasitized and uninfected cells.
5 PAPERS • 2 BENCHMARKS
Part of the Controlled Noisy Web Labels Dataset.
Tencent ML-Images is a large open-source multi-label image database, including 17,609,752 training and 88,739 validation image URLs, which are annotated with up to 11,166 categories.
The Urban Environments dataset is a dataset of 20 land use classes across 300 European cities paired with satellite imagery data.
AmsterTime dataset offers a collection of 2,500 well-curated images matching the same scene from a street view matched to historical archival image data from Amsterdam city. The image pairs capture the same place with different cameras, viewpoints, and appearances. Unlike existing benchmark datasets, AmsterTime is directly crowdsourced in a GIS navigation platform (Mapillary). In turn, all the matching pairs are verified by a human expert to verify the correct matches and evaluate the human competence in the Visual Place Recognition (VPR) task for further references.
4 PAPERS • 3 BENCHMARKS
CI-MNIST (Correlated and Imbalanced MNIST) is a variant of MNIST dataset with introduced different types of correlations between attributes, dataset features, and an artificial eligibility criterion. For an input image $x$, the label $y \in \{1, 0\}$ indicates eligibility or ineligibility, respectively, given that $x$ is even or odd. The dataset defines the background colors as the protected or sensitive attribute $s \in \{0, 1\}$, where blue denotes the unprivileged group and red denotes the privileged group. The dataset was designed in order to evaluate bias-mitigation approaches in challenging setups and be capable of controlling different dataset configurations.
4 PAPERS • NO BENCHMARKS YET
DiagSet is a histopathological dataset for prostate cancer detection. The proposed dataset consists of over 2.6 million tissue patches extracted from 430 fully annotated scans, 4675 scans with assigned binary diagnosis, and 46 scans with diagnosis given independently by a group of histopathologists.
It includes 47,978 butterfly images with a 4-level label-hierarchy. Hierarchy of labels from the ETHEC dataset across 4 levels: family, sub-family, genus and species. 6 family -> 21 sub-family -> 135 genus -> 561 species
ImageNet-Patch: A Dataset for Benchmarking Machine Learning Robustness against Adversarial Patches
InsPLAD is a Dataset for Power Line Asset Inspection containing 10,607 high-resolution Unmanned Aerial Vehicles colour images. It contains 17 unique power line assets captured from real-world operating power lines. Some of those assets (five, to be precise) are also annotated regarding their conditions. They present the following defects: corrosion (4 of them), broken/missing cap (1 of them), and bird's nest presence (1 of them).
4 PAPERS • 1 BENCHMARK
Context This is image data of Natural Scenes around the world.
4 PAPERS • 2 BENCHMARKS
Kuzushiji-Kanji is an imbalanced dataset of total 3832 Kanji characters (64x64 grayscale, 140,426 images), ranging from 1,766 examples to only a single example per class. Kuzushiji is a Japanese cursive writing style.
The LIMUC dataset is the largest publicly available labeled ulcerative colitis dataset that compromises 11276 images from 564 patients and 1043 colonoscopy procedures. Three experienced gastroenterologists were involved in the annotation process, and all images are labeled according to the Mayo endoscopic score (MES).
The MNIST Large Scale dataset is based on the classic MNIST dataset, but contains large scale variations up to a factor of 16. The motivation behind creating this dataset was to enable testing the ability of different algorithms to learn in the presence of large scale variability and specifically the ability to generalise to new scales not present in the training set over wide scale ranges.
MuMiN is a misinformation graph dataset containing rich social media data (tweets, replies, users, images, articles, hashtags), spanning 21 million tweets belonging to 26 thousand Twitter threads, each of which have been semantically linked to 13 thousand fact-checked claims across dozens of topics, events and domains, in 41 different languages, spanning more than a decade.