The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, keypoint detection, and captioning dataset. It consists of 328K images.
10,098 PAPERS • 92 BENCHMARKS
PASCAL VOC 2007 is a dataset for image recognition covering twenty selected object classes.
119 PAPERS • 14 BENCHMARKS
UVO is a new benchmark for open-world, class-agnostic object segmentation in videos. Besides shifting the problem focus to the open-world setup, UVO is significantly larger than earlier benchmarks, providing approximately 8 times more videos than DAVIS and 7 times more mask (instance) annotations per video than YouTube-VOS and YouTube-VIS. UVO is also more challenging, as it includes many videos with crowded scenes and complex background motions.
22 PAPERS • 3 BENCHMARKS
The COCO-Mixed dataset includes 897 images annotated with both known and unknown categories. It contains 2,533 unknown objects and 2,658 known objects; the original COCO annotations serve as labels for the known objects, and unambiguous unlabeled objects are additionally annotated. Evaluation on this dataset is more challenging because its images contain more object instances, with complex categories and concentrated locations.
1 PAPER • 1 BENCHMARK
The COCO-OOD dataset contains only unknown categories and consists of 504 images with fine-grained annotations of 1,655 unknown objects. The annotations combine the original COCO annotations with augmented annotations that follow the COCO definition.
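
The image and object counts quoted for COCO-Mixed and COCO-OOD can be turned into a rough density comparison. The following is a minimal sketch that uses only the figures stated in the two descriptions above; the dictionary layout and the helper names `objects_per_image` and `unknown_fraction` are illustrative assumptions, not part of either dataset's tooling.

```python
# Counts copied from the COCO-Mixed and COCO-OOD descriptions.
# The structure and helper names below are hypothetical, for illustration only.
DATASETS = {
    "COCO-Mixed": {"images": 897, "known": 2658, "unknown": 2533},
    # COCO-OOD is described as containing only unknown categories.
    "COCO-OOD": {"images": 504, "known": 0, "unknown": 1655},
}


def objects_per_image(stats):
    """Mean number of annotated objects (known + unknown) per image."""
    return (stats["known"] + stats["unknown"]) / stats["images"]


def unknown_fraction(stats):
    """Share of annotated objects that belong to unknown categories."""
    total = stats["known"] + stats["unknown"]
    return stats["unknown"] / total


if __name__ == "__main__":
    for name, stats in DATASETS.items():
        print(
            f"{name}: {objects_per_image(stats):.2f} objects/image, "
            f"{unknown_fraction(stats):.1%} unknown"
        )
```

Under these figures, COCO-Mixed averages roughly 5.8 annotated objects per image against roughly 3.3 for COCO-OOD, which is consistent with the statement that COCO-Mixed images contain more object instances.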