BiSeNet: Bilateral Segmentation Network for Real-time Semantic Segmentation

Semantic segmentation requires both rich spatial information and a sizeable receptive field. However, modern approaches usually compromise spatial resolution to achieve real-time inference speed, which leads to poor performance. In this paper, we address this dilemma with a novel Bilateral Segmentation Network (BiSeNet). We first design a Spatial Path with a small stride to preserve spatial information and generate high-resolution features. Meanwhile, a Context Path with a fast downsampling strategy is employed to obtain a sufficient receptive field. On top of the two paths, we introduce a new Feature Fusion Module to combine their features efficiently. The proposed architecture strikes the right balance between speed and segmentation performance on the Cityscapes, CamVid, and COCO-Stuff datasets. Specifically, for a 2048x1024 input, we achieve 68.4% mean IoU on the Cityscapes test set at a speed of 105 FPS on one NVIDIA Titan XP card, which is significantly faster than existing methods with comparable performance.
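
To make the resolution trade-off between the two paths concrete, the sketch below computes feature-map sizes for the 2048x1024 input mentioned in the abstract. The downsampling factors (1/8 for the Spatial Path, 1/16 and 1/32 for the Context Path) are assumptions taken from the body of the paper rather than from this abstract, and the name `feature_map_size` is hypothetical.

```python
def feature_map_size(width, height, stride):
    """Return (width, height) of a feature map after downsampling by `stride`."""
    # Ceiling division, so sizes that are not multiples of the stride round up.
    # This rounding is an assumption about padding, not something the abstract states.
    return (-(-width // stride), -(-height // stride))


INPUT_W, INPUT_H = 2048, 1024  # input resolution quoted in the abstract

# Assumed overall strides for each path (not given in the abstract).
PATH_STRIDES = {
    "spatial_path": 8,
    "context_path_stage_16": 16,
    "context_path_stage_32": 32,
}

sizes = {
    name: feature_map_size(INPUT_W, INPUT_H, stride)
    for name, stride in PATH_STRIDES.items()
}

for name, (w, h) in sizes.items():
    print(f"{name}: {w}x{h}")
```

Under these assumptions the Spatial Path keeps a 256x128 map, while the Context Path works on 128x64 and 64x32 maps, which is what lets it gain receptive field cheaply.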

ECCV 2018
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Semantic Segmentation | CamVid | BiSeNet | Mean IoU | 68.7% | #8 |
| Real-Time Semantic Segmentation | CamVid | BiSeNet | mIoU | 68.7% | #20 |
| Real-Time Semantic Segmentation | Cityscapes test | BiSeNet | mIoU | 74.7% | #17 |
| | | | Frame (fps) | 65.5 | #10 |
| Semantic Segmentation | Cityscapes test | BiSeNet (ResNet-101) | Mean IoU (class) | 78.9% | #57 |
| Real-Time Semantic Segmentation | Cityscapes test | BiSeNet (Xception39) | mIoU | 68.4% | #31 |
| | | | Time (ms) | 9.5 | #4 |
| | | | Frame (fps) | 105.8 | #6 |
| Real-Time Semantic Segmentation | Cityscapes test | BiSeNet (ResNet-18) | mIoU | 74.7% | #17 |
| | | | Time (ms) | 15.2 | #10 |
| | | | Frame (fps) | 65.5 | #10 |
| Dichotomous Image Segmentation | DIS-TE1 | BSV1 | max F-Measure | 0.595 | #21 |
| | | | weighted F-measure | 0.474 | #20 |
| | | | MAE | 0.108 | #20 |
| | | | S-Measure | 0.695 | #20 |
| | | | E-measure | 0.741 | #20 |
| | | | HCE | 288 | #20 |
| Dichotomous Image Segmentation | DIS-TE2 | BSV1 | max F-Measure | 0.680 | #20 |
| | | | weighted F-measure | 0.564 | #20 |
| | | | MAE | 0.111 | #20 |
| | | | S-Measure | 0.740 | #18 |
| | | | E-measure | 0.781 | #19 |
| | | | HCE | 621 | #20 |
| Dichotomous Image Segmentation | DIS-TE3 | BSV1 | max F-Measure | 0.710 | #20 |
| | | | weighted F-measure | 0.595 | #20 |
| | | | MAE | 0.109 | #20 |
| | | | S-Measure | 0.757 | #17 |
| | | | E-measure | 0.801 | #19 |
| | | | HCE | 1146 | #20 |
| Dichotomous Image Segmentation | DIS-TE4 | BSV1 | max F-Measure | 0.710 | #19 |
| | | | weighted F-measure | 0.598 | #20 |
| | | | MAE | 0.114 | #20 |
| | | | S-Measure | 0.755 | #15 |
| | | | E-measure | 0.788 | #20 |
| | | | HCE | 3999 | #20 |
| Dichotomous Image Segmentation | DIS-VD | BSV1 | max F-Measure | 0.662 | #19 |
| | | | weighted F-measure | 0.548 | #19 |
| | | | MAE | 0.116 | #19 |
| | | | S-Measure | 0.728 | #17 |
| | | | E-measure | 0.767 | #19 |
| | | | HCE | 1660 | #20 |
| Semantic Segmentation | SkyScapes-Dense | BiSeNet (ResNet-50) | Mean IoU | 30.82 | #4 |
| Semantic Segmentation | Trans10K | BiSeNet | mIoU | 58.40% | #12 |
| | | | GFLOPs | 19.91 | #3 |
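
Most of the results above are reported as mean IoU (mIoU). As a reminder of what that metric measures, here is a minimal sketch that computes it from a confusion matrix. The function name `mean_iou` and the toy matrix are hypothetical, and the sketch does not reproduce any benchmark's exact protocol (for example, handling of ignored labels).

```python
def mean_iou(confusion):
    """Mean IoU from a square confusion matrix.

    confusion[i][j] counts pixels whose ground-truth class is i
    and whose predicted class is j.
    """
    num_classes = len(confusion)
    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]                                      # true positives
        fn = sum(confusion[c]) - tp                               # missed pixels of class c
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp  # wrongly labelled as c
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both ground truth and prediction
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


# Toy two-class example (made-up pixel counts, not benchmark data).
toy_confusion = [
    [50, 10],
    [5, 35],
]
toy_miou = mean_iou(toy_confusion)
print(f"mIoU = {toy_miou:.4f}")
```

Here the per-class IoUs are 50/65 and 35/50, and mIoU is their unweighted average, so every class counts equally regardless of how many pixels it covers.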

Methods