Semantic Segmentation Modules

Point-wise Spatial Attention

Introduced by Zhao et al. in PSANet: Point-wise Spatial Attention Network for Scene Parsing

Point-wise Spatial Attention (PSA) is a semantic segmentation module. The goal is to capture contextual information, especially over long ranges, by aggregating information across the feature map. In the PSA module, information aggregation is performed as a kind of information flow: a pixel-wise global attention map is adaptively learned for each position, from two complementary perspectives, to aggregate contextual information over the entire feature map.

The PSA module takes a spatial feature map $\mathbf{X}$ as input. We denote the spatial size of $\mathbf{X}$ as $H \times W$. Through two parallel branches, we generate a pixel-wise global attention map for each position in feature map $\mathbf{X}$ through several convolutional layers.
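To make the shapes concrete: since each of the $H \times W$ positions gets its own attention map over the whole $H \times W$ grid, the predicted attention for one branch can be viewed as an $(HW) \times (HW)$ matrix. The following is a minimal numpy sketch of this shape bookkeeping; the random projection `W_attn` is a hypothetical stand-in for the branch's convolutional layers, and all sizes are toy values.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4                   # toy channel count and spatial size
N = H * W                           # number of spatial positions

X = rng.standard_normal((C, H, W))  # input feature map X
X_flat = X.reshape(C, N)            # one column per spatial position

# Stand-in for the convolutional layers that predict, for every position,
# an attention map over the whole H x W grid (N values per position).
W_attn = rng.standard_normal((N, C))  # hypothetical learned projection
A = W_attn @ X_flat                   # (N, N): column j = attention logits of position j

print(A.shape)                        # one H*W-sized map per position
```

The real module predicts these maps with convolutions over local neighborhoods rather than a single per-pixel projection, but the resulting attention tensor has the same $(HW) \times (HW)$ interpretation.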

We aggregate input feature maps based on attention maps to generate new feature representations with the long-range contextual information incorporated, i.e., $\mathbf{Z}_{c}$ from the ‘collect’ branch and $\mathbf{Z}_{d}$ from the ‘distribute’ branch.
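The two branches can be sketched as matrix products over the flattened feature map. In this toy numpy example the attention matrices are random stand-ins for the learned maps (normalized here with a softmax purely for illustration): the 'collect' branch lets each position gather features from every other position, while the 'distribute' branch lets each position send its feature to every other position.

```python
import numpy as np

rng = np.random.default_rng(0)
C, H, W = 8, 4, 4                        # toy sizes
N = H * W
X_flat = rng.standard_normal((C, N))     # flattened input feature map X

def softmax(a, axis):
    e = np.exp(a - a.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical predicted attention maps, one H*W map per position.
A_c = softmax(rng.standard_normal((N, N)), axis=1)  # 'collect' weights
A_d = softmax(rng.standard_normal((N, N)), axis=0)  # 'distribute' weights

# Collect: position i gathers features from every position j, weighted by A_c[i, j].
Z_c = X_flat @ A_c.T                     # (C, N)

# Distribute: position j receives the feature of position i, weighted by A_d[i, j].
Z_d = X_flat @ A_d                       # (C, N)

print(Z_c.shape, Z_d.shape)
```

Both outputs keep the spatial layout of $\mathbf{X}$, so each position now carries context aggregated from the entire feature map.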

We concatenate the new representations $\mathbf{Z}_{c}$ and $\mathbf{Z}_{d}$ and apply a convolutional layer with batch normalization and activation layers for dimension reduction and feature fusion. Then we concatenate this new global contextual feature with the local representation feature $\mathbf{X}$. This is followed by one or more convolutional layers with batch normalization and activation layers to generate the final feature map for the following subnetworks.
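The fusion step above can be sketched as follows. This is a simplified numpy illustration: the $1 \times 1$ convolution is modelled as a plain matrix multiply, batch normalization is omitted, and `W_reduce` is a hypothetical weight matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
C, N = 8, 16                                # toy channel count and H*W
X_flat = rng.standard_normal((C, N))        # local feature X, flattened
Z_c = rng.standard_normal((C, N))           # 'collect' branch output (stand-in)
Z_d = rng.standard_normal((C, N))           # 'distribute' branch output (stand-in)

# Concatenate Z_c and Z_d, then reduce dimension with a 1x1 convolution,
# modelled here as a matrix multiply followed by a ReLU (BN omitted).
Z = np.concatenate([Z_c, Z_d], axis=0)      # (2C, N)
W_reduce = rng.standard_normal((C, 2 * C))  # hypothetical 1x1 conv weights
Z_global = np.maximum(W_reduce @ Z, 0.0)    # (C, N) fused global context

# Concatenate the global context with the local feature X;
# the final convolutional layers then operate on this combined map.
F = np.concatenate([X_flat, Z_global], axis=0)  # (2C, N)

print(F.shape)
```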

Source: PSANet: Point-wise Spatial Attention Network for Scene Parsing

Tasks

| Task | Papers | Share |
| --- | --- | --- |
| Semantic Segmentation | 2 | 28.57% |
| 3D Shape Recognition | 1 | 14.29% |
| 3D Shape Reconstruction | 1 | 14.29% |
| Virtual Try-on | 1 | 14.29% |
| Point Cloud Segmentation | 1 | 14.29% |
| Scene Parsing | 1 | 14.29% |