The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.
3,806 PAPERS • 58 BENCHMARKS
The NUS-WIDE dataset contains 269,648 images with a total of 5,018 tags collected from Flickr. These images are manually annotated with 81 concepts, including objects and scenes.
195 PAPERS • 3 BENCHMARKS
The PASCAL Visual Object Classes (VOC) 2012 dataset contains 20 object categories spanning four broad groups (vehicles, household objects, animals, and other): aeroplane, bicycle, boat, bus, car, motorbike, train, bottle, chair, dining table, potted plant, sofa, TV/monitor, bird, cat, cow, dog, horse, sheep, and person. Each image has pixel-level segmentation annotations, bounding-box annotations, and object-class annotations. The dataset has been widely used as a benchmark for object detection, semantic segmentation, and classification. It is split into three subsets: 1,464 training images, 1,449 validation images, and a private test set.
158 PAPERS • 35 BENCHMARKS
The CheXpert dataset contains 224,316 chest radiographs of 65,240 patients, with both frontal and lateral views available. The associated task is automated chest X-ray interpretation; the dataset features uncertainty labels and radiologist-labeled reference-standard evaluation sets.
122 PAPERS • 1 BENCHMARK
ChestX-ray14 is a medical imaging dataset comprising 112,120 frontal-view X-ray images of 30,805 unique patients (collected from 1992 to 2015), with fourteen common disease labels text-mined from the radiological reports via NLP techniques. It expands on ChestX-ray8 by adding six additional thorax diseases: Consolidation, Edema, Emphysema, Fibrosis, Pleural Thickening, and Hernia.
101 PAPERS • 3 BENCHMARKS
PASCAL VOC 2007 is a dataset for image recognition covering twenty object classes, the same classes used in later VOC editions: aeroplane, bicycle, bird, boat, bottle, bus, car, cat, chair, cow, dining table, dog, horse, motorbike, person, potted plant, sheep, sofa, train, and TV/monitor.
85 PAPERS • 10 BENCHMARKS
CAL500 (Computer Audition Lab 500) is a dataset intended for evaluating music information retrieval systems. It consists of 502 songs of Western popular music. The audio is represented as a time series of the first 13 Mel-frequency cepstral coefficients (and their first and second derivatives), extracted by sliding a 12 ms half-overlapping short-time window over the waveform of each song. Each song was annotated by at least three people with 135 musically relevant concepts spanning six semantic categories.
16 PAPERS • NO BENCHMARKS YET
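The windowed-feature representation described above can be sketched with plain NumPy. The MFCC computation itself is omitted (a library such as librosa would normally supply it), so the 13-dimensional features below are placeholders and all function names are illustrative:

```python
import numpy as np

def frame_signal(y, sr, win_ms=12.0):
    """Slice a waveform into half-overlapping short-time windows."""
    frame_len = int(round(sr * win_ms / 1000.0))  # samples per 12 ms window
    hop = frame_len // 2                          # half-overlap
    n_frames = 1 + (len(y) - frame_len) // hop
    return np.stack([y[i * hop : i * hop + frame_len] for i in range(n_frames)])

def with_deltas(feats):
    """Append first and second time derivatives, e.g. 13 MFCCs -> 39 dims."""
    d1 = np.gradient(feats, axis=0)   # first derivative over frames
    d2 = np.gradient(d1, axis=0)      # second derivative
    return np.concatenate([feats, d1, d2], axis=1)
```

At a 22,050 Hz sampling rate this framing yields 265-sample windows with a 132-sample hop; stacking the derivatives turns a (frames × 13) MFCC matrix into the (frames × 39) representation the description refers to.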
LSHTC is a dataset for large-scale text classification. The data used in the LSHTC challenges originates from two popular sources: DBpedia and the ODP (Open Directory Project) directory, also known as DMOZ. The DBpedia instances were selected from the English, non-regional Extended Abstracts provided by the DBpedia site. The DMOZ instances consist of Content vectors, Description vectors, or both. A Content vector is obtained by directly indexing the web page using a standard indexing chain (preprocessing, stemming/lemmatization, stop-word removal).
13 PAPERS • NO BENCHMARKS YET
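The indexing chain that produces a Content vector can be illustrated with a minimal sketch; the tiny stop-word list and the suffix-stripping "stemmer" below are toy stand-ins for the real preprocessing components, which are not specified in detail:

```python
from collections import Counter

STOPWORDS = {"the", "a", "an", "of", "and", "to", "is"}  # tiny illustrative list

def naive_stem(token):
    """Crude suffix stripping, standing in for a real stemmer/lemmatizer."""
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def content_vector(text):
    """Index a page into a bag-of-words "Content vector":
    lowercase, strip punctuation, drop stop words, stem, count terms."""
    tokens = (t.strip(".,;:!?") for t in text.lower().split())
    return Counter(naive_stem(t) for t in tokens if t and t not in STOPWORDS)
```

A production system would use an established stemmer (e.g. Porter) and a full stop-word list, but the pipeline shape (preprocess, normalize, count) is the same.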
The friedman1 dataset is a synthetic regression benchmark generated by Friedman's #1 function; it is commonly used to test semi-supervised regression methods.
11 PAPERS • NO BENCHMARKS YET
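The Friedman #1 generator draws inputs uniformly from the unit hypercube and computes the target from only the first five features, leaving any extra features irrelevant. A minimal NumPy version (the signature here mirrors scikit-learn's `make_friedman1` for convenience, but is our own sketch) looks like:

```python
import numpy as np

def make_friedman1(n_samples=200, n_features=10, noise=1.0, seed=0):
    """Generate the Friedman #1 synthetic regression problem.

    Only the first five features influence the target; the remaining
    features are noise inputs, which is what makes the dataset useful
    for probing feature selection and semi-supervised regression.
    """
    rng = np.random.default_rng(seed)
    X = rng.uniform(size=(n_samples, n_features))
    y = (10.0 * np.sin(np.pi * X[:, 0] * X[:, 1])
         + 20.0 * (X[:, 2] - 0.5) ** 2
         + 10.0 * X[:, 3]
         + 5.0 * X[:, 4]
         + noise * rng.normal(size=n_samples))
    return X, y
```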
The Arxiv ASTRO-PH (Astrophysics) collaboration network is drawn from the e-print arXiv and covers scientific collaborations between authors of papers submitted to the Astrophysics category. If author i co-authored a paper with author j, the graph contains an undirected edge between i and j. If a paper is co-authored by k authors, this generates a completely connected (sub)graph on those k nodes.
9 PAPERS • 2 BENCHMARKS
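The clique-expansion rule described above, where each k-author paper contributes a complete subgraph on its k authors, can be sketched in a few lines of Python (the edge-set representation is an illustrative choice; a graph library would work equally well):

```python
from itertools import combinations

def coauthorship_graph(papers):
    """Build an undirected co-authorship edge set: each paper with k
    authors contributes a complete subgraph (clique) on its k nodes."""
    edges = set()
    for authors in papers:
        # sorted() canonicalizes each undirected edge as an ordered pair
        for i, j in combinations(sorted(set(authors)), 2):
            edges.add((i, j))
    return edges
```

For example, a 3-author paper yields a triangle and a 2-author paper a single edge; shared authors merge the cliques into one network.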
MIMIC-CXR, from the Massachusetts Institute of Technology, comprises 371,920 chest X-rays associated with 227,943 imaging studies from 65,079 patients. The studies were performed at Beth Israel Deaconess Medical Center in Boston, MA.
7 PAPERS • 1 BENCHMARK
ECHR is an English legal judgment prediction dataset of cases from the European Court of Human Rights (ECHR). The dataset contains ~11.5k cases, including the raw text.
4 PAPERS • NO BENCHMARKS YET
This dataset contains Bangla handwritten numerals, basic characters, and compound characters. It was collected from multiple geographical locations within Bangladesh and includes samples from a variety of age groups. It can also be used for other classification problems, e.g. gender, age, or district.
3 PAPERS • 2 BENCHMARKS
CHiME-Home is a dataset for sound source recognition in a domestic environment, comprising around 6.8 hours of domestic audio recordings. The recordings were obtained from the CHiME projects (computational hearing in multisource environments), for which recording equipment was positioned inside an English Victorian semi-detached house. They were selected from 22 sessions totalling 19.5 hours, each session recorded between 07:30 and 20:00. In the selected recordings the equipment was placed in the lounge (sitting room) near the door opening onto a hallway, which in turn opens onto a kitchen with no door. With the lounge door typically open, prominent sounds may therefore originate from sources in both the lounge and the kitchen.
3 PAPERS • NO BENCHMARKS YET
Moviescope is a large-scale dataset of 5,000 movies with corresponding video trailers, posters, plots, and metadata. It is based on the IMDB 5000 dataset, consisting of 5,043 movie records, augmented by crawling the video trailer associated with each movie from YouTube and its text plot from Wikipedia.
2 PAPERS • NO BENCHMARKS YET
This dataset aims to help V-NLIs (visualization-oriented natural language interfaces) recognize analytic tasks from free-form natural language by training and evaluating state-of-the-art multi-label classification models. It contains diverse user queries, each annotated with one or more analytic tasks.
1 PAPER • NO BENCHMARKS YET
ScienceExamCER is a collection of resources for studying explanation-centered inference, including explanation graphs for 1,680 questions, with 4,950 tablestore rows, and other analyses of the knowledge required to answer elementary and middle-school science questions.
1 PAPER • NO BENCHMARKS YET
The KIT Whole-Body Human Motion Database is a large-scale dataset of whole-body human motion, accompanied by methods and tools that provide a unified representation of captured motion, efficient search in the database, and transfer of subject-specific motions to robots with different embodiments. Captured subject-specific motion is normalized with respect to the subject's height and weight using a reference kinematics and dynamics model of the human body, the Master Motor Map (MMM). In contrast with previous approaches and human motion databases, the motion data here capture not only the motions of the human subject but also the position and motion of the objects with which the subject interacts. See the paper for a description of the MMM reference model and for the procedures and techniques used for the systematic recording, labeling, and organization of human motion-capture data, object motions, and subject–object relations.
0 PAPERS • NO BENCHMARKS YET