Fast Video Object Segmentation by Reference-Guided Mask Propagation

We present an efficient method for semi-supervised video object segmentation. Our method achieves accuracy competitive with state-of-the-art methods while running in a fraction of the time. To this end, we propose a deep Siamese encoder-decoder network that is designed to take advantage of mask propagation and object detection while avoiding the weaknesses of both approaches. Our network, learned through a two-stage training process that exploits both synthetic and real data, works robustly without any online learning or post-processing. We validate our method on four benchmark sets that cover single and multiple object segmentation. On all benchmark sets, our method achieves comparable accuracy while running an order of magnitude faster. We also provide extensive ablation and add-on studies to analyze and evaluate our framework.
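The inference loop suggested by the title (reference-guided mask propagation) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function name `propagate_masks`, the `model` callable, and its four-argument signature are assumptions made for this example. It assumes the network sees the annotated first frame (the reference) together with the current frame and the previously predicted mask, and that no online fine-tuning or post-processing happens between frames.

```python
from typing import Callable, List, Sequence, TypeVar

Frame = TypeVar("Frame")
Mask = TypeVar("Mask")


def propagate_masks(
    frames: Sequence[Frame],
    first_mask: Mask,
    model: Callable[[Frame, Mask, Frame, Mask], Mask],
) -> List[Mask]:
    """Propagate a first-frame mask through a video, one frame at a time.

    `model` is a hypothetical stand-in for a trained network. It is assumed
    to take (reference frame, reference mask, current frame, previous mask)
    and return the mask for the current frame.
    """
    if not frames:
        return []

    # The annotated first frame stays fixed as the reference for every step.
    ref_frame, ref_mask = frames[0], first_mask

    masks = [first_mask]
    prev_mask = first_mask
    for frame in frames[1:]:
        # One forward pass per frame; the previous prediction is fed back in.
        prev_mask = model(ref_frame, ref_mask, frame, prev_mask)
        masks.append(prev_mask)
    return masks
```

Because each frame costs a single forward pass in this sketch, runtime grows linearly with video length, which is consistent with the speed claim in the abstract.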

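Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | Jaccard (Mean) | 81.5 | # 59
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | Jaccard (Recall) | 91.7 | # 20
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | Jaccard (Decay) | 10.9 | # 11
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | F-measure (Mean) | 82.0 | # 57
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | F-measure (Recall) | 90.8 | # 15
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | F-measure (Decay) | 10.1 | # 12
Semi-Supervised Video Object Segmentation | DAVIS 2016 | RGMP | J&F | 81.75 | # 57
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | J&F | 52.8 | # 50
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | Jaccard (Mean) | 51.3 | # 49
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | Jaccard (Recall) | 59.0 | # 12
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | Jaccard (Decay) | 34.3 | # 21
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | F-measure (Mean) | 54.4 | # 52
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | F-measure (Recall) | 61.9 | # 14
Semi-Supervised Video Object Segmentation | DAVIS 2017 (test-dev) | RGMP | F-measure (Decay) | 37.2 | # 21
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | Jaccard (Mean) | 64.8 | # 60
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | Jaccard (Recall) | 74.1 | # 15
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | Jaccard (Decay) | 18.9 | # 16
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | F-measure (Mean) | 68.6 | # 64
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | F-measure (Recall) | 77.7 | # 15
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | F-measure (Decay) | 19.6 | # 12
Semi-Supervised Video Object Segmentation | DAVIS 2017 (val) | RGMP | J&F | 66.7 | # 63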
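
For readers unfamiliar with the metrics listed above, the sketch below shows how a per-frame Jaccard score (region intersection-over-union) and the combined J&F score can be computed. It is an illustrative, pure-Python version and not the official DAVIS evaluation code; the function names `jaccard` and `j_and_f` are chosen for this example, and returning 1.0 when both masks are empty is an assumed convention. The boundary F-measure itself is not implemented here, since it requires contour matching.

```python
from typing import Sequence


def jaccard(pred: Sequence[Sequence[int]], gt: Sequence[Sequence[int]]) -> float:
    """Intersection-over-union between two binary masks of equal shape."""
    intersection = 0
    union = 0
    for pred_row, gt_row in zip(pred, gt):
        for p, g in zip(pred_row, gt_row):
            if p and g:
                intersection += 1
            if p or g:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0
    return intersection / union


def j_and_f(j_mean: float, f_mean: float) -> float:
    """Combined score: the average of Jaccard (Mean) and F-measure (Mean)."""
    return (j_mean + f_mean) / 2


if __name__ == "__main__":
    pred = [[1, 1], [0, 0]]
    gt = [[1, 0], [0, 0]]
    print(jaccard(pred, gt))    # one shared pixel out of two in the union
    print(j_and_f(81.5, 82.0))  # DAVIS 2016 values from the listing
```

Applied to the DAVIS 2016 entries (Jaccard mean 81.5, F-measure mean 82.0), the average is 81.75, which matches the listed J&F value. Small differences on other rows, such as 52.8 on DAVIS 2017 (test-dev), presumably come from rounding of the reported means.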
