Video Object Segmentation
240 papers with code • 9 benchmarks • 17 datasets
Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.
For leaderboards, please refer to the individual subtasks.
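To make the "binary labeling" definition above concrete, here is a minimal sketch of what the output looks like: one boolean mask per frame, which can then be used to split the video into foreground and background. The synthetic video, the brightness threshold, and all function names (`make_toy_video`, `label_foreground`, `split_foreground_background`) are assumptions made purely for illustration; the threshold is a stand-in for a real segmentation model, not a method from any paper listed here.

```python
import numpy as np

def make_toy_video(num_frames=3, height=6, width=6):
    """Build a dark RGB video with a bright 2x2 square that shifts right each frame."""
    video = np.zeros((num_frames, height, width, 3), dtype=np.float32)
    for t in range(num_frames):
        video[t, 2:4, t:t + 2, :] = 1.0
    return video

def label_foreground(video, threshold=0.5):
    """Toy stand-in for a segmentation model: bright pixels count as foreground.

    Returns a boolean array of shape (T, H, W), i.e. one binary mask per frame.
    """
    return video.mean(axis=-1) > threshold

def split_foreground_background(video, masks):
    """Use the per-frame masks to separate foreground pixels from background pixels."""
    m = masks[..., None]  # add a channel axis so the mask broadcasts over RGB
    foreground = np.where(m, video, 0.0)
    background = np.where(m, 0.0, video)
    return foreground, background

video = make_toy_video()
masks = label_foreground(video)
foreground, background = split_foreground_background(video, masks)
print(masks.shape, masks.sum(axis=(1, 2)))  # mask shape and foreground pixel count per frame
```

In practice the thresholding step would be replaced by a learned model, but the input and output shapes (a `(T, H, W, 3)` video in, a `(T, H, W)` binary mask out) stay the same.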
Libraries
Use these libraries to find Video Object Segmentation models and implementations
Datasets
Subtasks
Latest papers with no code
Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention
This is enabled by a deformable attention mechanism, where the keys and values capturing the memory of a video sequence in the attention module have flexible locations updated across frames.
Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation
Interactive Video Object Segmentation (iVOS) is a challenging task that requires real-time human-computer interaction.
Understanding Video Transformers via Universal Concept Discovery
Concretely, we seek to explain the decision-making process of video transformers based on high-level, spatiotemporal concepts that are automatically discovered.
No More Shortcuts: Realizing the Potential of Temporal Self-Supervision
To address these issues, we propose 1) a more challenging reformulation of temporal self-supervision as frame-level (rather than clip-level) recognition tasks and 2) an effective augmentation strategy to mitigate shortcuts.
TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking
In this work, we propose a novel, clip-based DETR-style encoder-decoder architecture, which focuses on systematically analyzing and addressing the aforementioned challenges.
VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models
Our model can edit and translate the desired results within seconds based on user instructions.
SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation
We evaluate our method on several benchmark datasets and achieve state-of-the-art results.
Sketch-based Video Object Segmentation: Benchmark and Analysis
Reference-based video object segmentation is an emerging topic which aims to segment the corresponding target object in each video frame referred by a given reference, such as a language expression or a photo mask.
Learning the What and How of Annotation in Video Object Segmentation
To reduce this annotation cost, in this paper, we propose EVA-VOS, a human-in-the-loop annotation framework for video object segmentation.
ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification
To build spatial AI systems that can quickly be taught about new objects, we need to effectively solve the problem of single-shot object detection, instance segmentation and re-identification.