Semi-Supervised Video Object Segmentation
94 papers with code • 15 benchmarks • 13 datasets
The semi-supervised scenario assumes the user provides a full mask of the object(s) of interest in the first frame of a video sequence. Methods must then produce segmentation masks for the same object(s) in all subsequent frames.
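The input/output contract described above can be sketched in a few lines of Python. This is only an illustration of the task setup, not a method from any listed paper: the function name `propagate_static`, the array shapes, and the convention that label 0 means background are all assumptions made for the example. The "method" here is a trivial baseline that reuses the first-frame mask for every frame.

```python
import numpy as np


def propagate_static(frames: np.ndarray, first_mask: np.ndarray) -> np.ndarray:
    """Trivial baseline (hypothetical): copy the first-frame mask to every frame.

    frames:     (T, H, W, 3) video frames.
    first_mask: (H, W) integer labels given by the user, 0 = background (assumed).
    Returns:    (T, H, W) predicted masks, one per frame.
    """
    if frames.shape[1:3] != first_mask.shape:
        raise ValueError("mask and frames must share height and width")
    # A real method would use the frame content; this baseline ignores it.
    return np.repeat(first_mask[None], frames.shape[0], axis=0)


# Toy video: 4 frames of 6x8 pixels, with one object annotated in frame 0.
frames = np.zeros((4, 6, 8, 3), dtype=np.uint8)
first_mask = np.zeros((6, 8), dtype=np.int64)
first_mask[2:4, 3:6] = 1

masks = propagate_static(frames, first_mask)
```

Any actual method replaces the body of the function while keeping the same contract: one annotated mask in, one mask per frame out.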
Most implemented papers
Learning Video Object Segmentation from Static Images
Inspired by recent advances of deep learning in instance segmentation and object tracking, we introduce the video object segmentation problem as a concept of guided instance segmentation.
Fast Video Object Segmentation by Reference-Guided Mask Propagation
We validate our method on four benchmark sets that cover single and multiple object segmentation.
RANet: Ranking Attention Network for Fast Video Object Segmentation
Specifically, to integrate the insights of matching-based and propagation-based methods, we employ an encoder-decoder framework to learn pixel-level similarity and segmentation in an end-to-end manner.
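As a rough illustration of what "pixel-level similarity" means in a matching-based setting, the sketch below transfers labels from a reference frame to the current frame by nearest-neighbour cosine similarity between per-pixel feature vectors. It is not the RANet architecture: there is no learned encoder-decoder, and the function name `match_labels` and the random features standing in for encoder outputs are assumptions made for the example.

```python
import numpy as np


def match_labels(ref_feats: np.ndarray, ref_mask: np.ndarray,
                 cur_feats: np.ndarray) -> np.ndarray:
    """Label each current-frame pixel with the label of its most similar
    reference-frame pixel (hypothetical helper, cosine similarity).

    ref_feats: (Hr, Wr, C) per-pixel features of the annotated frame.
    ref_mask:  (Hr, Wr) integer labels of the annotated frame.
    cur_feats: (H, W, C) per-pixel features of the current frame.
    """
    h, w, c = cur_feats.shape
    ref = ref_feats.reshape(-1, c)
    cur = cur_feats.reshape(-1, c)
    # L2-normalise so the dot product below is a cosine similarity.
    ref = ref / (np.linalg.norm(ref, axis=1, keepdims=True) + 1e-8)
    cur = cur / (np.linalg.norm(cur, axis=1, keepdims=True) + 1e-8)
    sim = cur @ ref.T                # (H*W, Hr*Wr) similarity matrix
    nearest = sim.argmax(axis=1)     # best reference pixel for each pixel
    return ref_mask.reshape(-1)[nearest].reshape(h, w)


# Toy check: the "current frame" is the reference shifted one pixel right.
rng = np.random.default_rng(0)
ref_feats = rng.normal(size=(4, 5, 16))   # stand-in for encoder features
ref_mask = np.zeros((4, 5), dtype=np.int64)
ref_mask[1:3, 1:3] = 1
cur_feats = np.roll(ref_feats, 1, axis=1)

pred = match_labels(ref_feats, ref_mask, cur_feats)
```

In this toy case the predicted mask follows the shift, because each shifted pixel's feature vector is identical to its original in the reference frame. Learned methods replace the random features with trained embeddings and typically decode the similarity map into a mask rather than taking a hard nearest neighbour.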
Joint-task Self-supervised Learning for Temporal Correspondence
Our learning process integrates two highly related tasks: tracking large image regions and establishing fine-grained pixel-level associations between consecutive video frames.
MAST: A Memory-Augmented Self-supervised Tracker
Recent interest in self-supervised dense tracking has yielded rapid progress, but performance remains far from that of supervised methods.
Learning Fast and Robust Target Models for Video Object Segmentation
The target appearance model consists of a light-weight module, which is learned during the inference stage using fast optimization techniques to predict a coarse but robust target segmentation.
Collaborative Video Object Segmentation by Foreground-Background Integration
This paper investigates the principles of embedding learning to tackle the challenging task of semi-supervised video object segmentation.
Learning What to Learn for Video Object Segmentation
This allows us to achieve a rich internal representation of the target in the current frame, significantly increasing the segmentation accuracy of our approach.
Associating Objects with Transformers for Video Object Segmentation
The state-of-the-art methods learn to decode features with a single positive object and thus have to match and segment each target separately in multi-object scenarios, consuming several times the computing resources.