Semi-Supervised Video Object Segmentation
95 papers with code • 15 benchmarks • 13 datasets
The semi-supervised scenario assumes the user provides a full mask of the object(s) of interest in the first frame of a video sequence. Methods must then produce a segmentation mask for the same object(s) in each subsequent frame.
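The input/output contract described above can be sketched as follows. This is only an illustration of the task setup, not the method of any paper listed here: the name `segment_video`, the nested-list frame and mask types, and the copy-forward baseline are all assumptions made for this sketch.

```python
from typing import List

Frame = List[List[int]]  # hypothetical grayscale frame, height x width
Mask = List[List[int]]   # per-pixel object IDs, 0 = background


def segment_video(frames: List[Frame], first_mask: Mask) -> List[Mask]:
    """Return one mask for every frame after the annotated first frame.

    Placeholder baseline (an assumption for illustration): the first-frame
    mask is copied forward unchanged. A real method would use the content
    of each frame to update the mask.
    """
    if not frames:
        raise ValueError("need at least the annotated first frame")
    height, width = len(frames[0]), len(frames[0][0])
    if len(first_mask) != height or any(len(row) != width for row in first_mask):
        raise ValueError("first_mask must match the frame size")
    # One independent copy of the mask per subsequent frame.
    return [[row[:] for row in first_mask] for _ in frames[1:]]


video = [[[0, 0], [0, 0]] for _ in range(3)]  # three 2x2 frames
annotation = [[0, 1], [0, 2]]                 # two objects, IDs 1 and 2
predictions = segment_video(video, annotation)
```

Here `predictions` holds two masks, one for each frame after the annotated one. Any actual method would replace the body of `segment_video` while keeping the same inputs and outputs.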
Latest papers
Towards Robust Video Object Segmentation with Adaptive Object Calibration
We consolidate this conditional mask calibration process in a progressive manner, where the object representations and proto-masks iteratively evolve to become discriminative.
Recurrent Dynamic Embedding for Video Object Segmentation
In this paper, we propose a Recurrent Dynamic Embedding (RDE) to build a memory bank of constant size.
Boosting Video Object Segmentation based on Scale Inconsistency
We present a refinement framework to boost the performance of pre-trained semi-supervised video object segmentation (VOS) models.
Adaptive Memory Management for Video Object Segmentation
Matching-based networks have achieved state-of-the-art performance on video object segmentation (VOS) tasks by storing every k-th frame in an external memory bank for future inference.
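The fixed-interval storage policy mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation from this or any other listed paper: the class `EveryKMemoryBank`, its method `maybe_store`, and the string placeholders standing in for features and masks are hypothetical.

```python
from typing import Any, List, Tuple


class EveryKMemoryBank:
    """Keeps (index, feature, mask) entries for every k-th frame (hypothetical helper)."""

    def __init__(self, k: int) -> None:
        if k < 1:
            raise ValueError("k must be a positive integer")
        self.k = k
        self.entries: List[Tuple[int, Any, Any]] = []

    def maybe_store(self, frame_index: int, feature: Any, mask: Any) -> bool:
        """Store the frame if its index is a multiple of k; report whether it was stored."""
        if frame_index % self.k == 0:
            self.entries.append((frame_index, feature, mask))
            return True
        return False


bank = EveryKMemoryBank(k=5)
for t in range(23):
    # Strings stand in for real feature maps and predicted masks.
    bank.maybe_store(t, feature=f"feat_{t}", mask=f"mask_{t}")
stored = [index for index, _, _ in bank.entries]
```

With `k=5` and 23 frames, `stored` is `[0, 5, 10, 15, 20]`. Under this policy the bank gains roughly one entry per `k` frames, so its size grows linearly with video length.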
Scalable Video Object Segmentation with Identification Mechanism
This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS).
MixFormer: End-to-End Tracking with Iterative Mixed Attention
Our core design is to utilize the flexibility of attention operations, and propose a Mixed Attention Module (MAM) for simultaneous feature extraction and target information integration.
Siamese Network with Interactive Transformer for Video Object Segmentation
Semi-supervised video object segmentation (VOS) refers to segmenting the target object in remaining frames given its annotation in the first frame, which has been actively studied in recent years.
Reliable Propagation-Correction Modulation for Video Object Segmentation
We introduce two modulators, propagation and correction modulators, to separately perform channel-wise re-calibration on the target frame embeddings according to local temporal correlations and reliable references respectively.
FAMINet: Learning Real-time Semi-supervised Video Object Segmentation with Steepest Optimized Optical Flow
This study proposes FAMINet, which consists of a feature extraction network (F), an appearance network (A), a motion network (M), and an integration network (I).
Dense Unsupervised Learning for Video Segmentation
On established VOS benchmarks, our approach exceeds the segmentation accuracy of previous work despite using significantly less training data and compute power.