49 dataset results for Semantic Segmentation AND Images AND English

MS COCO (Microsoft Common Objects in Context)

The MS COCO (Microsoft Common Objects in Context) dataset is a large-scale object detection, segmentation, key-point detection, and captioning dataset. The dataset consists of 328K images.

10,158 PAPERS • 92 BENCHMARKS

PASCAL VOC 2012 test

SCC Data Set

109 PAPERS • 3 BENCHMARKS

LIP (Look into Person)

The LIP (Look into Person) dataset is a large-scale dataset focusing on semantic understanding of a person. It contains 50,000 images with elaborate pixel-wise annotations of 19 semantic human part labels and 2D human poses with 16 key points. The images are collected from real-world scenarios, and the subjects appear with challenging poses and views, heavy occlusions, various appearances, and low resolution.

59 PAPERS • 1 BENCHMARK

LoveDA (Remote Sensing Land-Cover Dataset for Domain Adaptive Semantic Segmentation)

LoveDA contains 5,987 high-spatial-resolution (0.3 m) remote sensing images from Nanjing, Changzhou, and Wuhan. It focuses on the different geographical environments of urban and rural areas and is intended to advance both semantic segmentation and domain adaptation tasks. It poses three considerable challenges: multi-scale objects, complex background samples, and inconsistent class distributions.

45 PAPERS • 1 BENCHMARK

GID (Gaofen Image Dataset)

Gaofen Image Dataset (GID) is a large-scale land-cover dataset constructed with Gaofen-2 (GF-2) satellite images. It has advantages over existing land-cover datasets because of its large coverage, wide distribution, and high spatial resolution. It contains 150 GF-2 images annotated at the pixel level for 5 categories: built-up, farmland, forest, meadow, and water.

24 PAPERS • NO BENCHMARKS YET

DigestPath

Introduced by Da et al. in DigestPath: a Benchmark Dataset with Challenge Review for the Pathological Detection and Segmentation of Digestive-System

22 PAPERS • 1 BENCHMARK

FoodSeg103

FoodSeg103 is a new food image dataset containing 7,118 images. Images are annotated with 104 ingredient classes and each image has an average of 6 ingredient labels and pixel-wise masks. It's provided as a large-scale benchmark for food image segmentation.

14 PAPERS • 1 BENCHMARK

MCubeS

MCubeS (Multimodal Material Segmentation Dataset)

The Multimodal Material Segmentation (MCubeS) dataset contains 500 sets of images from 42 street scenes. Each scene has images for four modalities: RGB, angle of linear polarization (AoLP), degree of linear polarization (DoLP), and near-infrared (NIR). The dataset provides annotated ground truth labels for both material and semantic segmentation for every pixel. The dataset is divided into a training set with 302 image sets, a validation set with 96 image sets, and a test set with 102 image sets. Each image is 1224 x 1024 pixels, with 20 class labels in total.

10 PAPERS • 1 BENCHMARK

FMB Dataset

FMB Dataset (Full-time Multi-modality Benchmark Dataset)

FMB contains 1,500 well-registered infrared and visible image pairs with 14 annotated pixel-level categories. It covers a wide range of pixel variations and severe environments, e.g., dense fog, heavy rain, and low-light conditions. The dataset includes rich scenes under different illumination conditions, which helps fusion and segmentation models generalize better. 98.16% of all pixels are labeled into 14 categories: Road, Sidewalk, Building, Traffic Light, Traffic Sign, Vegetation, Sky, Person, Car, Truck, Bus, Motorcycle, Bicycle, and Pole, which often appear in real-world autonomous driving and semantic understanding tasks.

9 PAPERS • 1 BENCHMARK

BIMCV COVID-19

BIMCV-COVID19+ dataset is a large dataset with chest X-ray images CXR (CR, DX) and computed tomography (CT) imaging of COVID-19 patients along with their radiographic findings, pathologies, polymerase chain reaction (PCR), immunoglobulin G (IgG) and immunoglobulin M (IgM) diagnostic antibody tests and radiographic reports from Medical Imaging Databank in Valencian Region Medical Image Bank (BIMCV). The findings are mapped onto standard Unified Medical Language System (UMLS) terminology and they cover a wide spectrum of thoracic entities, contrasting with the much more reduced number of entities annotated in previous datasets. Images are stored in high resolution and entities are localized with anatomical labels in a Medical Imaging Data Structure (MIDS) format. In addition, 23 images were annotated by a team of expert radiologists to include semantic segmentation of radiographic findings. Moreover, extensive information is provided, including the patient’s demographic information, type

8 PAPERS • NO BENCHMARKS YET

WaterScenes

A Multi-Task 4D Radar-Camera Fusion Dataset for Autonomous Driving on Water Surfaces.

8 PAPERS • 2 BENCHMARKS

SpaceNet 2 (SpaceNet 2: Building Detection v2)

SpaceNet 2: Building Detection v2 is a dataset for building footprint detection in geographically diverse settings from very high resolution satellite images. It contains over 302,701 building footprints and 3/8-band WorldView-3 satellite imagery at 0.3 m pixel resolution across 5 cities (Rio de Janeiro, Las Vegas, Paris, Shanghai, Khartoum), and covers areas that are both urban and suburban in nature. The dataset was split 60%/20%/20% for train/test/validation.

7 PAPERS • 1 BENCHMARK

CC3M-TagMask

The dataset offers tag and mask annotations for image-text pairs from the CC3M validation set. Tag annotations denote words that aptly describe the relationship between the image and the corresponding text. These annotations provide valuable insights into the semantic connection between each pair's visual and textual elements.

5 PAPERS • 2 BENCHMARKS

LabPics (LabPics Dataset for computer vision for autonomous chemistry labs and medical labs)

LabPics Chemistry Dataset

5 PAPERS • NO BENCHMARKS YET

Satlas

Satlas is a remote sensing dataset and benchmark that is large in both breadth, featuring all of the aforementioned applications and more, and scale, comprising 290M labels under 137 categories and 7 label modalities.

5 PAPERS • NO BENCHMARKS YET

Sunnybrook Cardiac Data

The Sunnybrook Cardiac Data (SCD), also known as the 2009 Cardiac MR Left Ventricle Segmentation Challenge data, consist of 45 cine-MRI images from a mix of patients and pathologies: healthy, hypertrophy, heart failure with infarction, and heart failure without infarction. A subset of this data set was first used in the automated myocardium segmentation challenge from short-axis MRI, held by a MICCAI workshop in 2009. The complete data set is now available in the CAP database under a public domain license.

5 PAPERS • NO BENCHMARKS YET

XImageNet-12 (XIMAGENET-12: An Explainable AI Benchmark Dataset for Model Robustness Evaluation)

An enlarged dataset for understanding how image backgrounds affect computer vision models. It covers the following topics: blurred background, segmented background, AI-generated background, bias of tools during annotation, color in background, dependent factor in background, latent space distance of foreground, and random background with real environment.

5 PAPERS • 1 BENCHMARK

CropAndWeed Dataset

The CropAndWeed dataset is focused on the fine-grained identification of 74 relevant crop and weed species with a strong emphasis on data variability. Annotations of labeled bounding boxes, semantic masks and stem positions are provided for about 112k instances in more than 8k high-resolution images of both real-world agricultural sites and specifically cultivated outdoor plots of rare weed types. Additionally, each sample is enriched with meta-annotations regarding environmental conditions.

4 PAPERS • NO BENCHMARKS YET

Kvasir-Sessile dataset (Sessile polyps from Kvasir-SEG)

The Kvasir-SEG dataset includes 196 polyps smaller than 10 mm classified as Paris class 1 sessile or Paris class IIa. These polyps were selected with the help of expert gastroenterologists and released separately as a subset of Kvasir-SEG, called Kvasir-Sessile.

4 PAPERS • 1 BENCHMARK

SpaceNet 1 (SpaceNet 1: Building Detection v1)

SpaceNet 1: Building Detection v1 is a dataset for building footprint detection. The data is comprised of 382,534 building footprints, covering an area of 2,544 sq. km of 3/8 band WorldView-2 imagery (0.5 m pixel res.) across the city of Rio de Janeiro, Brazil. The images are processed as 200m×200m tiles with associated building footprint vectors for training.

4 PAPERS • 2 BENCHMARKS

Aircraft Context Dataset

The Aircraft Context Dataset, a composition of two inter-compatible large-scale and versatile image datasets focusing on manned aircraft and UAVs, is intended for training and evaluating classification, detection and segmentation models in aerial domains. Additionally, a set of relevant meta-parameters can be used to quantify dataset variability as well as the impact of environmental conditions on model performance.

3 PAPERS • NO BENCHMARKS YET

Five-Billion-Pixels

The Five-Billion-Pixels dataset contains more than 5 billion labeled pixels of 150 high-resolution Gaofen-2 (4 m) satellite images, annotated in a 24-category system covering artificial-constructed, agricultural, and natural classes. It offers rich categories, large coverage, wide distribution, and high spatial resolution, which reflect the distributions of real-world ground objects well and can benefit a range of land-cover studies.

3 PAPERS • NO BENCHMARKS YET

KvasirCapsule-SEG

A video capsule endoscopy dataset for polyp segmentation.

3 PAPERS • 1 BENCHMARK

Medico automatic polyp segmentation challenge (dataset)

The “Medico automatic polyp segmentation challenge” aims to develop computer-aided diagnosis systems for automatic polyp segmentation to detect all types of polyps (for example, irregular polyp, smaller or flat polyps) with high efficiency and accuracy. The main goal of the challenge is to benchmark semantic segmentation algorithms on a publicly available dataset, emphasizing robustness, speed, and generalization.

3 PAPERS • 1 BENCHMARK

Open Images V7

Open Images is a computer vision dataset covering ~9 million images with labels spanning thousands of object categories. A subset of 1.9M images includes diverse annotation types.

3 PAPERS • NO BENCHMARKS YET

PETRAW

PETRAW (PEg TRAnsfer Workflow recognition by different modalities)

The PETRAW data set is composed of 150 sequences of peg transfer training sessions. The objective of a peg transfer session is to transfer 6 blocks from the left to the right and back. Each block must be extracted from a peg with one hand, transferred to the other hand, and inserted in a peg on the other side of the board. All cases were acquired by a non-medical expert at the LTSI Laboratory of the University of Rennes. The data set is divided into a training set of 90 cases and a test set of 60 cases. Each case is composed of kinematic data, a video, semantic segmentation of each frame, and workflow annotation.

3 PAPERS • 6 BENCHMARKS

Endotect Polyp Segmentation Challenge Dataset

A challenge that consists of three tasks, each targeting a different requirement for in-clinic use. The first task involves classifying images from the GI tract into 23 distinct classes. The second task focuses on efficient classification, measured by the amount of time spent processing each image. The last task relates to automatically segmenting polyps.

2 PAPERS • 1 BENCHMARK

HERA RFI Detection

HERA RFI Detection (Hydrogen Epoch of Reionization Array (HERA))

This dataset contains simulated and expert-labelled spectrograms from two radio telescopes: the Hydrogen Epoch of Reionization Array (HERA) in South Africa and the Low-Frequency Array (LOFAR) in the Netherlands. These datasets are intended to test radio-frequency interference (RFI) detection schemes. This entry pertains to the HERA dataset specifically.

2 PAPERS • 1 BENCHMARK

HuTics (Human Deictic Gestures Dataset)

HuTics contains 2040 images showing how humans use deictic gestures to interact with various daily-life objects. The images are annotated by segmentation masks of the object(s) of interest. The original purpose of the data collection is for gesture-aware object-agnostic segmentation tasks.

2 PAPERS • NO BENCHMARKS YET

LOFAR RFI Detection

LOFAR RFI Detection (Low-Frequency Array (LOFAR) Radio Frequency Interference Detection)

2 PAPERS • 1 BENCHMARK

Mila Simulated Floods

The Mila Simulated Floods Dataset is a 1.5 square km virtual world built with the Unity3D game engine, including urban, suburban, and rural areas.

2 PAPERS • 1 BENCHMARK

OADAT

OADAT (OADAT: Experimental and Synthetic Clinical Optoacoustic Data for Standardized Image Processing)

Experimental and synthetic (simulated) optoacoustic (OA) raw-signal and reconstructed image-domain datasets, rendered with different experimental parameters and tomographic acquisition geometries.

2 PAPERS • NO BENCHMARKS YET

BCSD

BCSD (Bank Check Segmentation Dataset)

The dataset consists of images of 158 filled out bank checks containing various complex backgrounds, and handwritten text and signatures in the respective fields, along with both pixel-level and patch-level segmentation masks for the signatures on the checks. Please visit the dataset homepage for more details.

1 PAPER • NO BENCHMARKS YET

CheXlocalize

CheXlocalize is a radiologist-annotated segmentation dataset on chest X-rays. The dataset consists of two types of radiologist annotations for the localization of 10 pathologies: pixel-level segmentations and most-representative points. Annotations were drawn on images from the CheXpert validation and test sets. The dataset also consists of two separate sets of radiologist annotations: (1) ground-truth pixel-level segmentations on the validation and test sets, drawn by two board-certified radiologists, and (2) benchmark pixel-level segmentations and most-representative points on the test set, drawn by a separate group of three board-certified radiologists.

1 PAPER • NO BENCHMARKS YET

EBHI-Seg

EBHI-Seg is a dataset containing 5,170 images of six types of tumor differentiation stages and the corresponding ground truth images. The dataset can help researchers develop new segmentation algorithms for the medical diagnosis of colorectal cancer.

1 PAPER • NO BENCHMARKS YET

FracAtlas (A Dataset for Fracture Classification, Localization and Segmentation of Musculoskeletal Radiographs)

FractureAtlas is a musculoskeletal bone fracture dataset with annotations for deep learning tasks like classification, localization, and segmentation. The dataset contains a total of 4,083 X-Ray images with annotation in COCO, VGG, YOLO, and Pascal VOC format. This dataset is made freely available for any purpose. The data provided within this work are free to copy, share or redistribute in any medium or format. The data might be adapted, remixed, transformed, and built upon. The dataset is licensed under a CC-BY 4.0 license. It should be noted that to use the dataset correctly, one needs to have knowledge of medical and radiology fields to understand the results and make conclusions based on the dataset. It's also important to consider the possibility of labeling errors.

1 PAPER • NO BENCHMARKS YET

GUISS dataset

GUISS dataset (Meshes, textures, Blend files, stereo datasets, depth maps, depth estimations)

We provide all the expected data inputs to GUISS, such as meshes, texture images, and blend files. The generated datasets used in our experiments, along with the stereo depth estimations, can be downloaded. We have defined seven dataset types: scene_reconstructions, texture_variation, gaea_texture_variation, generative_texture, terrain_variation, rocks, and generative_texture_snow. Each dataset type contains renderings with varying values of different parameters such as lighting angle, texture images, albedo, etc. Position each dataset type folder under data/dataset/.

1 PAPER • NO BENCHMARKS YET

LIB-HSI

LIB-HSI (RGB and Hyperspectral images of Building Facades)

The LIB-HSI dataset contains hyperspectral reflectance images and their corresponding RGB images of building façades in a light industrial environment. The dataset also contains pixel-level annotated images for each hyperspectral/RGB image. The LIB-HSI dataset was created to develop deep learning methods for segmenting building facade materials.

1 PAPER • NO BENCHMARKS YET

RUGD

RUGD (RUGD: Robot Unstructured Ground Driving)

A Video Dataset for Visual Perception and Autonomous Navigation in Unstructured Environments. Website: http://rugd.vision/

1 PAPER • 1 BENCHMARK

Risk-Aware Planning Dataset

Risk-Aware Planning is a dataset that contains overhead images and their semantic segmentation, captured by a drone in the CityEnviron environment of the AirSim simulator.

1 PAPER • NO BENCHMARKS YET

SBCoseg

SBCoseg (SBCoseg Dataset)

The SBCoseg dataset includes 889 groups of images, and each group consists of 18 images with a common object, giving 16,002 images in total. The whole dataset is divided into five subsets: with ECFB, with TR, with MH, with SD, and Normal (normal data). The five subsets contain 193, 251, 82, 83, and 280 image groups, respectively. Each original image is in JPG format with a pixel size of 360 × 360, and each ground-truth image is in PNG format.

1 PAPER • 1 BENCHMARK
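
The figures quoted in the SBCoseg description can be cross-checked with a few lines of arithmetic. The sketch below only restates the numbers from the description; the variable names are illustrative and are not part of any official SBCoseg tooling.

```python
# Group counts per subset, as stated in the SBCoseg description.
# The dictionary keys are just labels chosen for this sketch.
subset_groups = {"ECFB": 193, "TR": 251, "MH": 82, "SD": 83, "Normal": 280}

# Each group is described as holding 18 images with a common object.
images_per_group = 18

# Sum the subsets and derive the overall image count.
total_groups = sum(subset_groups.values())
total_images = total_groups * images_per_group

print(f"groups: {total_groups}, images: {total_images}")
```

Both totals agree with the 889 groups and 16,002 images reported in the description.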

Semantic Segmentation Vineyard Rows

Test dataset for semantic segmentation. The dataset includes 500 RGB images with their corresponding single-channel binary masks. The images were taken in vineyards in Grugliasco, Turin, Piedmont Region, Italy.

1 PAPER • NO BENCHMARKS YET

TransProteus

The dataset contains procedurally generated images of transparent vessels containing liquid and objects. The data for each image includes segmentation maps, 3D depth maps, and normal maps of the liquid or object inside the transparent vessel, and of the vessel itself. In addition, the properties of the materials inside the containers are given (color/transparency/roughness/metalness). A natural image benchmark for the 3D/depth estimation of objects inside transparent containers is also supplied, as are 3D models of the objects (GTLF).

1 PAPER • 1 BENCHMARK

UIIS (General Underwater Image Instance Segmentation dataset)

This is the first general Underwater Image Instance Segmentation (UIIS) dataset, containing 4,628 images for 7 categories with pixel-level annotations for the underwater instance segmentation task.

1 PAPER • 1 BENCHMARK

UTFPR-SBD3

The semantic segmentation of clothes is a challenging task due to the wide variety of clothing styles, layers, and shapes. UTFPR-SBD3 contains 4,500 images manually annotated at the pixel level in 18 classes plus background. To ensure the high quality of the dataset, all images were manually annotated at the pixel level using JS Segment Annotator, a free web-based image annotation tool. The raw images were carefully selected to avoid, as far as possible, classes with a low number of instances.

1 PAPER • 1 BENCHMARK

Corn Seeds Dataset

This dataset contains images of corn seeds, with the top and bottom views considered independently (two images per seed: top and bottom). There are four classes of corn seed: Broken-B, Discolored-D, Silkcut-S, and Pure-P. 17,802 images were labeled by experts at AdTech Corp. A further 26K images were unlabeled, of which 9K were labeled using active learning (BatchBALD).

0 PAPER • NO BENCHMARKS YET

Lemons quality control dataset

The Lemon dataset was prepared to investigate approaches to fruit quality control. It contains 2,690 annotated images (1056 x 1056 pixels). Raw lemon images were captured using the procedure described in the following blogpost and manually annotated using CVAT.

0 PAPER • NO BENCHMARKS YET

MVP-24K

MVP-24K (Multi-grained Vehicle Parsing dataset)

Multi-grained Vehicle Parsing (MVP) is a large-scale dataset for semantic analysis of vehicles in the wild, which has several featured properties. (1) MVP contains 24,000 vehicle images captured in real-world surveillance scenes, which makes it more scalable for real applications. (2) To serve different requirements, the vehicle images are annotated with pixel-level part masks at two granularities: coarse annotations of ten classes and fine annotations of 59 classes. The former can be applied to object-level applications such as vehicle Re-Id, fine-grained classification, and pose estimation, while the latter can be explored for high-quality image generation and content manipulation. (3) The images reflect the complexity of real surveillance scenes, such as different viewpoints, illumination conditions, and backgrounds. In addition, the vehicles come from diverse countries and span diverse types, brands, models, and colors, which makes the dataset more diverse and challenging.

0 PAPER • NO BENCHMARKS YET
