VOS is a video object segmentation model consisting of two network components: a target appearance model and a segmentation network. The target appearance model is a light-weight module, learned during the inference stage with fast optimization techniques, that predicts a coarse but robust target segmentation. The segmentation network is trained exclusively offline and refines these coarse scores into high-quality segmentation masks.
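The two-stage idea above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the target model is approximated by a linear per-pixel classifier fitted online with a few gradient steps, and the offline-trained refinement network is replaced by a simple threshold stand-in. All function names and parameters here are assumptions for illustration.

```python
import numpy as np

def learn_target_model(features, labels, steps=20, lr=0.5, reg=1e-2):
    """Online stage (sketch): fit light-weight linear weights w with a few
    fast gradient steps on a regularized least-squares loss.

    features: (N, C) per-pixel feature vectors from the annotated first frame
    labels:   (N,) target labels in {0, 1}
    """
    x = np.hstack([features, np.ones((features.shape[0], 1))])  # add bias term
    n, c = x.shape
    w = np.zeros(c)
    for _ in range(steps):
        scores = x @ w
        grad = x.T @ (scores - labels) / n + reg * w
        w -= lr * grad
    return w

def coarse_scores(features, w):
    """Predict coarse but robust target scores for a new frame."""
    x = np.hstack([features, np.ones((features.shape[0], 1))])
    return x @ w

def refine(scores, threshold=0.5):
    """Stand-in for the offline-trained segmentation network, which would
    turn coarse scores into a high-quality mask; here we only threshold."""
    return (scores > threshold).astype(np.uint8)

# Toy example: 2-D features where channel 0 separates target from background.
rng = np.random.default_rng(0)
feats = rng.normal(size=(200, 2))
labs = (feats[:, 0] > 0).astype(float)

w = learn_target_model(feats, labs)       # fast online optimization
mask = refine(coarse_scores(feats, w))    # coarse scores -> final mask
```

In the actual method, the refinement stage is a full segmentation network trained offline on large datasets, so only the tiny target model needs to be optimized at test time — which is what makes inference fast.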
Source: Learning Fast and Robust Target Models for Video Object Segmentation
| Task | Papers | Share |
|---|---|---|
| Video Object Segmentation | 81 | 21.60% |
| Semantic Segmentation | 80 | 21.33% |
| Video Semantic Segmentation | 79 | 21.07% |
| Semi-Supervised Video Object Segmentation | 32 | 8.53% |
| Optical Flow Estimation | 10 | 2.67% |
| Unsupervised Video Object Segmentation | 7 | 1.87% |
| One-shot Visual Object Segmentation | 7 | 1.87% |
| Visual Object Tracking | 6 | 1.60% |
| Object Detection | 6 | 1.60% |