Video Object Segmentation

240 papers with code • 9 benchmarks • 17 datasets

Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.
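As a rough illustration of what "binary labeling" means here, the sketch below treats a video as a list of grayscale frames and assigns every pixel a True (foreground) or False (background) label. The per-pixel scores, the 0.5 threshold, and the helper names `binarize_scores` and `split_video` are assumptions made for this example; real methods produce the labeling with a learned model rather than a fixed threshold.

```python
from typing import List, Tuple

Frame = List[List[float]]   # one grayscale frame: rows of pixel values
Mask = List[List[bool]]     # one binary labeling: True = foreground


def binarize_scores(scores: List[Frame], threshold: float = 0.5) -> List[Mask]:
    """Turn per-pixel foreground scores (hypothetical model output) into binary labels."""
    return [[[s > threshold for s in row] for row in frame] for frame in scores]


def split_video(frames: List[Frame], masks: List[Mask]) -> Tuple[List[Frame], List[Frame]]:
    """Separate each frame into a foreground-only and a background-only copy."""
    foreground = [
        [[p if m else 0 for p, m in zip(prow, mrow)] for prow, mrow in zip(frame, mask)]
        for frame, mask in zip(frames, masks)
    ]
    background = [
        [[0 if m else p for p, m in zip(prow, mrow)] for prow, mrow in zip(frame, mask)]
        for frame, mask in zip(frames, masks)
    ]
    return foreground, background


# Toy video: 2 frames of 2x2 pixels, with made-up foreground scores.
frames = [[[10, 20], [30, 40]], [[50, 60], [70, 80]]]
scores = [[[0.9, 0.1], [0.2, 0.8]], [[0.7, 0.3], [0.4, 0.6]]]

masks = binarize_scores(scores)
foreground, background = split_video(frames, masks)
```

Every pixel ends up in exactly one of the two outputs, which is the defining property of a binary foreground/background labeling.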

For leaderboards, please refer to the individual subtasks.

Benchmarks

These leaderboards are used to track progress in Video Object Segmentation.

Dataset	Best Model
DAVIS 2016	ISVOS (BL30K, MS)
DAVIS 2017 (val)	XMem (BL30K, MS)
YouTube-VOS 2018	XMem (BL30K, MS)
DAVIS 2017 (test-dev)	BATMAN
YouTube-VOS 2019	XMem (BL30K, MS)
DAVIS 2017	AOC-MF (val)
FBMS	Ours
DAVIS-2017 (test-dev)	XMem (BL30K, MS)
YouTube	Ours

Libraries

Use these libraries to find Video Object Segmentation models and implementations.

yoxu515/aot-benchmark • 4 papers • 560 stars
visionml/pytracking • 3 papers • 3,080 stars
hkchengrex/Mask-Propagation • 3 papers • 124 stars
z-x-yang/AOT • 3 papers • 116 stars

Subtasks

Video Salient Object Detection

Interactive Video Object Segmentation

Long-tail Video Object Segmentation

Latest papers with no code

Self-supervised Video Object Segmentation with Distillation Learning of Deformable Attention

no code yet • 25 Jan 2024

This is enabled by a deformable attention mechanism, in which the keys and values that capture the memory of a video sequence in the attention module have flexible locations that are updated across frames.

Explore Synergistic Interaction Across Frames for Interactive Video Object Segmentation

no code yet • 23 Jan 2024

Interactive Video Object Segmentation (iVOS) is a challenging task that requires real-time human-computer interaction.

Understanding Video Transformers via Universal Concept Discovery

no code yet • 19 Jan 2024

Concretely, we seek to explain the decision-making process of video transformers based on high-level, spatiotemporal concepts that are automatically discovered.

No More Shortcuts: Realizing the Potential of Temporal Self-Supervision

no code yet • 20 Dec 2023

To address these issues, we propose 1) a more challenging reformulation of temporal self-supervision as frame-level (rather than clip-level) recognition tasks and 2) an effective augmentation strategy to mitigate shortcuts.

TAM-VT: Transformation-Aware Multi-scale Video Transformer for Segmentation and Tracking

no code yet • 13 Dec 2023

In this work, we propose a novel, clip-based, DETR-style encoder-decoder architecture that focuses on systematically analyzing and addressing the aforementioned challenges.

VIDiff: Translating Videos via Multi-Modal Instructions with Diffusion Models

no code yet • 30 Nov 2023

Our model can edit and translate the desired results within seconds based on user instructions.

SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation

no code yet • 30 Nov 2023

We evaluate our method on several benchmark datasets and achieve state-of-the-art results.

Sketch-based Video Object Segmentation: Benchmark and Analysis

no code yet • 13 Nov 2023

Reference-based video object segmentation is an emerging topic that aims to segment, in each video frame, the target object indicated by a given reference, such as a language expression or a photo mask.

Learning the What and How of Annotation in Video Object Segmentation

no code yet • 8 Nov 2023

To reduce this annotation cost, in this paper, we propose EVA-VOS, a human-in-the-loop annotation framework for video object segmentation.

ISAR: A Benchmark for Single- and Few-Shot Object Instance Segmentation and Re-Identification

no code yet • 5 Nov 2023

To build spatial AI systems that can quickly be taught about new objects, we need to effectively solve the problem of single-shot object detection, instance segmentation and re-identification.
