Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.
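Since the task is binary labeling per frame, VOS benchmarks such as DAVIS typically score predictions with the region similarity J, i.e. the intersection-over-union of the predicted and ground-truth foreground masks. A minimal NumPy sketch (the function name `jaccard` and the toy masks are illustrative, not from any specific benchmark toolkit):

```python
import numpy as np

def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    """Region similarity J: intersection-over-union of two binary masks."""
    pred = pred.astype(bool)
    gt = gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return float(np.logical_and(pred, gt).sum() / union)

# Toy 4x4 frame: prediction overlaps ground truth on 2 pixels,
# union covers 4 pixels, so J = 2/4.
gt   = np.array([[0, 0, 0, 0], [0, 1, 1, 0], [0, 1, 0, 0], [0, 0, 0, 0]])
pred = np.array([[0, 0, 0, 0], [0, 1, 1, 1], [0, 0, 0, 0], [0, 0, 0, 0]])
print(jaccard(pred, gt))  # prints 0.5
```

Per-video scores are then usually averaged over all annotated frames.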
For leaderboards please refer to the different subtasks.
With these challenges in mind, we propose a novel training method and evaluation metrics for the seed rejection problem.
Our approach tolerates a modest amount of noise in the box placements, so typically only a few clicks are needed to annotate tracked boxes to sufficient accuracy.
As our training sets grow, ever more objects are observed in motion, turning our method into unsupervised (or time-supervised) training for segmenting primary objects.
Video object segmentation can be understood as a sequence-to-sequence task that can benefit from curriculum learning strategies for better and faster training of deep neural networks.
In this paper, we introduce a novel network, called discriminative feature network (DFNet), to address the unsupervised video object segmentation task.
Self-supervised learning for visual object tracking offers valuable advantages over supervised learning, such as removing the need for laborious human annotations and permitting online training.
Unlike previous works, we apply the Hide-and-Seek strategy during pre-training to obtain the best possible results in handling occlusions and extracting segment boundaries.
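The Hide-and-Seek strategy (Singh & Lee, 2017) randomly hides grid patches of the input during training, forcing the network to rely on less discriminative regions, which is what makes it useful for occlusion robustness. A minimal NumPy sketch under that reading (function name, grid size, and fill value are illustrative choices, not the paper's exact configuration):

```python
import numpy as np

def hide_and_seek(image: np.ndarray, grid: int = 4, p_hide: float = 0.5,
                  fill: float = 0.0, seed=None) -> np.ndarray:
    """Hide-and-Seek augmentation: divide the image into a grid x grid
    layout of patches and replace each patch with `fill` independently
    with probability p_hide."""
    rng = np.random.default_rng(seed)
    out = image.copy()  # do not modify the caller's image in place
    h, w = image.shape[:2]
    ph, pw = h // grid, w // grid
    for i in range(grid):
        for j in range(grid):
            if rng.random() < p_hide:
                out[i * ph:(i + 1) * ph, j * pw:(j + 1) * pw] = fill
    return out

# Usage: hide roughly half of the 2x2-pixel patches of an 8x8 frame.
frame = np.ones((8, 8), dtype=np.float32)
masked = hide_and_seek(frame, grid=4, p_hide=0.5, seed=0)
```

In practice the fill value is often the dataset mean rather than zero, so hidden patches match the input statistics the network expects.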
Significant progress has been made in Video Object Segmentation (VOS), the video object tracking task at its finest level of granularity.
Although our baseline system is a straightforward combination of standard methods, we obtain state-of-the-art results.