Scene Segmentation with Dual Relation-aware Attention Network

In this article, we propose a Dual Relation-aware Attention Network (DRANet) to handle the task of scene segmentation. Efficiently exploiting context is essential for pixel-level recognition. To address this issue, we adaptively capture contextual information with a relation-aware attention mechanism. Specifically, we append two types of attention modules on top of a dilated fully convolutional network (FCN); these model the contextual dependencies in the spatial and channel dimensions, respectively. In the attention modules, we adopt a self-attention mechanism to model semantic associations between any two pixels or channels. Each pixel or channel can adaptively aggregate context from all pixels or channels according to their correlations. To reduce the high computation and memory cost of this pairwise association computation, we further design two types of compact attention modules. In the compact attention modules, each pixel or channel is associated with only a small number of gathering centers and aggregates its context over these gathering centers. In addition, we add a cross-level gating decoder to selectively enhance spatial details, which boosts the performance of the network. We conduct extensive experiments to validate the effectiveness of our network and achieve new state-of-the-art segmentation performance on four challenging scene segmentation data sets: Cityscapes, ADE20K, PASCAL Context, and COCO Stuff. In particular, a Mean IoU score of 82.9% on the Cityscapes test set is achieved without using extra coarse annotated data.
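The difference between the pairwise attention and the compact attention described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how queries, keys, or gathering centers are computed, so the sketch assumes raw features are used directly as queries and keys, and that gathering centers are formed by averaging groups of pixels. The function names and the way `centers` is built are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def spatial_self_attention(feat):
    # feat: (N, C), N = H*W flattened pixels, C channels.
    # Every pixel attends to every pixel: the affinity matrix is (N, N).
    affinity = softmax(feat @ feat.T, axis=-1)
    return affinity @ feat

def channel_self_attention(feat):
    # Every channel attends to every channel: the affinity matrix is (C, C).
    affinity = softmax(feat.T @ feat, axis=-1)
    # Output channel i is a weighted sum of all input channels j.
    return feat @ affinity.T

def compact_spatial_attention(feat, centers):
    # centers: (K, C) with K much smaller than N (assumed given here).
    # Every pixel attends only to the K centers: the affinity matrix is (N, K).
    affinity = softmax(feat @ centers.T, axis=-1)
    return affinity @ centers

rng = np.random.default_rng(0)
feat = rng.standard_normal((64, 8))              # 64 pixels, 8 channels
# Assumption for illustration: 4 centers, each the mean of 16 consecutive pixels.
centers = feat.reshape(4, 16, 8).mean(axis=1)

full_out = spatial_self_attention(feat)          # uses a 64 x 64 affinity
chan_out = channel_self_attention(feat)          # uses an 8 x 8 affinity
compact_out = compact_spatial_attention(feat, centers)  # uses a 64 x 4 affinity
```

With N pixels and K centers, the spatial affinity shrinks from N x N entries to N x K, which is where the savings in computation and memory would come from under these assumptions.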

Task                  | Dataset         | Model                                             | Metric Name      | Metric Value | Global Rank
Semantic Segmentation | ADE20K          | DRAN(ResNet-101)                                  | Validation mIoU  | 46.18        | #173
Semantic Segmentation | Cityscapes test | DRAN(ResNet-101) (with only fine annotated data)  | Mean IoU (class) | 82.9%        | #23
Semantic Segmentation | COCO-Stuff test | DRAN(ResNet-101)                                  | mIoU             | 41.2%        | #8
Semantic Segmentation | PASCAL Context  | DRAN(ResNet-101)                                  | mIoU             | 55.4%        | #26
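
The mIoU metric reported for these results is the per-class intersection over union averaged across classes. The following is a generic sketch of that definition, not the evaluation code of any of the benchmarks; it assumes integer class labels in the range 0 to num_classes - 1 and does not handle "ignore" labels, which the official evaluation scripts may treat differently. The function name `mean_iou` is hypothetical.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    # Flatten the predicted and ground-truth label maps.
    pred = np.asarray(pred).ravel()
    target = np.asarray(target).ravel()
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(
        target * num_classes + pred, minlength=num_classes ** 2
    ).reshape(num_classes, num_classes)
    inter = np.diag(conf)                          # correctly labeled pixels per class
    union = conf.sum(axis=0) + conf.sum(axis=1) - inter
    valid = union > 0                              # skip classes absent from both maps
    return float((inter[valid] / union[valid]).mean())

# Tiny example: 4 pixels, 2 classes.
score = mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
```

In this example class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.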
