Semantic Segmentation
5184 papers with code • 125 benchmarks • 311 datasets
Semantic Segmentation is a computer vision task in which the goal is to assign every pixel in an image to a class or object, producing a dense pixel-wise segmentation map. Example benchmarks for this task include Cityscapes, PASCAL VOC and ADE20K. Models are usually evaluated with the Mean Intersection-Over-Union (Mean IoU) and Pixel Accuracy metrics.
(Image credit: CSAILVision)
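The two evaluation metrics named above can be sketched in a few lines. This is a minimal illustration in plain Python, assuming label maps are already flattened into lists of integer class ids; the function names (`confusion_matrix`, `pixel_accuracy`, `mean_iou`) are hypothetical and not taken from any benchmark toolkit. Individual benchmarks may differ in details such as ignored labels or whether counts are accumulated over the whole dataset.

```python
def confusion_matrix(y_true, y_pred, num_classes):
    """Count pixels by (true class, predicted class) over flattened label maps."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix


def pixel_accuracy(matrix):
    """Fraction of pixels whose predicted class equals the true class."""
    correct = sum(matrix[c][c] for c in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def mean_iou(matrix):
    """Average of per-class IoU = TP / (TP + FP + FN).

    Assumption: classes absent from both maps are skipped rather than
    counted as 0 or 1, and at least one class is present.
    """
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                        # true c, predicted other
        fp = sum(matrix[r][c] for r in range(n)) - tp   # predicted c, true other
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes.
truth = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(truth, pred, num_classes=3)
print(pixel_accuracy(cm))  # 4 of 6 pixels correct
print(mean_iou(cm))        # mean of per-class IoUs 1/3, 2/3, 1/2
```

In this toy case Pixel Accuracy is about 0.67 while Mean IoU is 0.5, which shows why the two are usually reported together: Mean IoU weights every class equally, whereas Pixel Accuracy is dominated by frequent classes.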
Libraries
Use these libraries to find Semantic Segmentation models and implementations
Subtasks
- Tumor Segmentation
- Panoptic Segmentation
- 3D Semantic Segmentation
- Weakly-Supervised Semantic Segmentation
- Scene Segmentation
- Semi-Supervised Semantic Segmentation
- Real-Time Semantic Segmentation
- 3D Part Segmentation
- Unsupervised Semantic Segmentation
- Road Segmentation
- One-Shot Segmentation
- Bird's-Eye View Semantic Segmentation
- Crack Segmentation
- UNET Segmentation
- Universal Segmentation
- Class-Incremental Semantic Segmentation
- Polyp Segmentation
- Vision-Language Segmentation
- 4D Spatio Temporal Semantic Segmentation
- Histopathological Segmentation
- Attentive segmentation networks
- Text-Line Extraction
- Aerial Video Semantic Segmentation
- Amodal Panoptic Segmentation
- Robust BEV Map Segmentation
Most implemented papers
SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation
We show that SegNet provides good performance with competitive inference time and more efficient inference memory-wise as compared to other architectures.
Swin Transformer: Hierarchical Vision Transformer using Shifted Windows
This paper presents a new vision Transformer, called Swin Transformer, that capably serves as a general-purpose backbone for computer vision.
Pyramid Scene Parsing Network
Scene parsing is challenging for unrestricted open vocabulary and diverse scenes.
PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space
By exploiting metric space distances, our network is able to learn local features with increasing contextual scales.
Searching for MobileNetV3
We achieve new state of the art results for mobile classification, detection and segmentation.
Fully Convolutional Networks for Semantic Segmentation
Convolutional networks are powerful visual models that yield hierarchies of features.
ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation
The ability to perform pixel-wise semantic segmentation in real-time is of paramount importance in mobile applications.
Masked Autoencoders Are Scalable Vision Learners
Our MAE approach is simple: we mask random patches of the input image and reconstruct the missing pixels.
YOLACT: Real-time Instance Segmentation
Then we produce instance masks by linearly combining the prototypes with the mask coefficients.
DeepLab: Semantic Image Segmentation with Deep Convolutional Nets, Atrous Convolution, and Fully Connected CRFs
ASPP probes an incoming convolutional feature layer with filters at multiple sampling rates and effective fields-of-views, thus capturing objects as well as image context at multiple scales.
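The idea in the DeepLab excerpt, applying the same filter at several sampling rates, can be illustrated with a one-dimensional toy. This is only a sketch in plain Python, not the DeepLab implementation: the helper names `dilated_conv1d` and `aspp_branches` are hypothetical, the kernel is assumed to have odd length, borders are zero-padded, and the step that fuses the branch outputs is left out.

```python
def dilated_conv1d(signal, kernel, rate):
    """Same-size 1D atrous convolution (cross-correlation) with zero padding.

    Kernel taps are spaced `rate` samples apart, so a larger rate widens
    the field of view without adding weights.
    """
    half = len(kernel) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + (k - half) * rate  # input index sampled by tap k
            if 0 <= j < len(signal):   # positions outside the signal count as zero
                acc += w * signal[j]
        out.append(acc)
    return out


def aspp_branches(signal, kernel, rates):
    """Run the same kernel at several rates; returns one output per rate."""
    return {rate: dilated_conv1d(signal, kernel, rate) for rate in rates}


# A single impulse makes the growing field of view visible.
impulse = [0, 0, 0, 1, 0, 0, 0]
for rate, response in aspp_branches(impulse, [1, 1, 1], rates=[1, 2]).items():
    print(rate, response)
```

With rate 1 the three taps cover adjacent samples, while with rate 2 the same three weights span five samples, which is how parallel branches at different rates can capture context at multiple scales.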