Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement

NeurIPS 2020 · Yongqing Liang, Xin Li, Navid Jafari, Qin Chen

We propose a new matching-based framework for semi-supervised video object segmentation (VOS). Recently, state-of-the-art VOS performance has been achieved by matching-based algorithms, in which feature banks are created to store features for region matching and classification. However, how to effectively organize information in a continuously growing feature bank remains under-explored, which leads to inefficient bank designs. We introduce an adaptive feature bank update scheme that dynamically absorbs new features and discards obsolete ones. We also design a new confidence loss and a fine-grained segmentation module to improve segmentation accuracy in uncertain regions. On public benchmarks, our algorithm outperforms existing state-of-the-art methods.
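To make the idea of an adaptive feature bank concrete, here is a minimal sketch of a fixed-capacity bank that absorbs new features and discards rarely used ones. It is not the authors' implementation: the class `FeatureBank`, the cosine-similarity merge rule, the parameters `capacity`, `merge_threshold`, and `momentum`, and the usage-count eviction policy are all assumptions chosen for illustration.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


class FeatureBank:
    """Illustrative bank (hypothetical design, not the paper's code)."""

    def __init__(self, capacity, merge_threshold=0.95, momentum=0.9):
        self.capacity = capacity                # assumed hard limit on entries
        self.merge_threshold = merge_threshold  # assumed similarity cutoff
        self.momentum = momentum                # assumed weight of the old feature
        self.features = []                      # stored feature vectors
        self.hits = []                          # how often each entry was matched

    def absorb(self, feat):
        # Find the stored feature most similar to the incoming one.
        best_i, best_s = -1, -1.0
        for i, stored in enumerate(self.features):
            s = cosine(stored, feat)
            if s > best_s:
                best_i, best_s = i, s

        if best_i >= 0 and best_s >= self.merge_threshold:
            # Close match: merge by weighted averaging instead of growing.
            m = self.momentum
            old = self.features[best_i]
            self.features[best_i] = [m * o + (1 - m) * n for o, n in zip(old, feat)]
            self.hits[best_i] += 1
        else:
            # Novel feature: store it, then evict if over capacity.
            self.features.append(list(feat))
            self.hits.append(1)
            if len(self.features) > self.capacity:
                self._evict()

    def _evict(self):
        # Drop the least-used entry among the older ones (never the newest).
        victim = min(range(len(self.features) - 1), key=lambda i: self.hits[i])
        del self.features[victim]
        del self.hits[victim]
```

With `capacity=2`, absorbing `[1, 0]` and then `[0.99, 0.01]` merges the two into one entry, while a later dissimilar feature is stored separately and forces the least-matched older entry out once the limit is exceeded. The bank size therefore stays bounded regardless of video length, which is the property the abstract motivates.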


Results from the Paper


Task: Semi-Supervised Video Object Segmentation  ·  Model: AFB-URR

Dataset                            Metric Name          Metric Value   Global Rank
DAVIS 2017 (val)                   Jaccard (Mean)       73.0           # 51
DAVIS 2017 (val)                   Jaccard (Recall)     85.3           # 5
DAVIS 2017 (val)                   Jaccard (Decay)      13.8           # 7
DAVIS 2017 (val)                   F-measure (Mean)     76.1           # 53
DAVIS 2017 (val)                   F-measure (Recall)   87.0           # 7
DAVIS 2017 (val)                   F-measure (Decay)    15.5           # 4
DAVIS 2017 (val)                   J&F                  74.6           # 54
DAVIS (no YouTube-VOS training)    FPS                  4.00           # 19
DAVIS (no YouTube-VOS training)    D17 val (G)          74.6           # 12
DAVIS (no YouTube-VOS training)    D17 val (J)          73.0           # 9
DAVIS (no YouTube-VOS training)    D17 val (F)          76.1           # 13
Long Video Dataset                 J&F                  83.7           # 6
Long Video Dataset                 J                    82.9           # 3
Long Video Dataset                 F                    84.5           # 3
Long Video Dataset (3X)            J&F                  83.8           # 2
Long Video Dataset (3X)            J                    82.9           # 2
Long Video Dataset (3X)            F                    84.6           # 2
YouTube-VOS 2018                   F-Measure (Seen)     83.1           # 42
YouTube-VOS 2018                   F-Measure (Unseen)   82.6           # 39
YouTube-VOS 2018                   Overall              79.6           # 41
YouTube-VOS 2018                   Jaccard (Seen)       78.8           # 41
YouTube-VOS 2018                   Jaccard (Unseen)     74.1           # 41
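
For readers unfamiliar with the metrics above: J (Jaccard) is the intersection-over-union between a predicted mask and the ground-truth mask, and J&F is the average of the mean J score and the mean F (contour accuracy) score. The sketch below shows both computations on small binary masks. The function names `jaccard` and `j_and_f`, and the convention of returning 1.0 when both masks are empty, are assumptions for illustration and are not taken from any official evaluation toolkit.

```python
def jaccard(pred, gt):
    """Intersection-over-union of two binary masks given as nested lists."""
    inter = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                inter += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union


def j_and_f(j_mean, f_mean):
    """Average of region similarity (J) and contour accuracy (F)."""
    return (j_mean + f_mean) / 2.0


# Example: one overlapping pixel out of three covered pixels.
pred_mask = [[1, 1],
             [0, 0]]
gt_mask = [[1, 0],
           [1, 0]]
score = jaccard(pred_mask, gt_mask)

# Averaging the DAVIS 2017 (val) means reported in the table.
davis_mean = j_and_f(73.0, 76.1)
```

Averaging the reported DAVIS 2017 (val) means of 73.0 and 76.1 gives about 74.55, which agrees with the listed J&F of 74.6 up to rounding.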

Methods