Attention Mechanisms

Global second-order pooling convolutional networks

Introduced by Gao et al. in Global Second-order Pooling Convolutional Networks

A GSoP block has a squeeze module and an excitation module, and uses second-order pooling to model higher-order statistics while gathering global information. In the squeeze module, a GSoP block first reduces the number of channels from $c$ to $c'$ ($c' < c$) using a $1 \times 1$ convolution, then computes a $c' \times c'$ covariance matrix over the reduced channels to capture their pairwise correlations. Next, row-wise normalization is performed on the covariance matrix. Each entry $(i, j)$ of the normalized covariance matrix explicitly relates channel $i$ to channel $j$.
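A minimal sketch of the squeeze step in PyTorch, assuming an input of shape (batch, c, h, w); the paper only states that normalization is row-wise, so the L2 normalization per row below is an assumption:

```python
import torch
import torch.nn as nn

def gsop_squeeze(x: torch.Tensor, reduce: nn.Conv2d) -> torch.Tensor:
    """Reduce channels with a 1x1 conv, then pool second-order statistics.

    x: (batch, c, h, w); reduce: a 1x1 Conv2d mapping c -> c' channels.
    Returns a row-normalized (batch, c', c') covariance matrix.
    """
    z = reduce(x)                                          # (batch, c', h, w)
    b, cp, h, w = z.shape
    z = z.view(b, cp, h * w)                               # each channel is a length-(h*w) sample
    z = z - z.mean(dim=2, keepdim=True)                    # center each channel
    cov = torch.bmm(z, z.transpose(1, 2)) / (h * w - 1)    # (batch, c', c') channel covariance
    # Row-wise normalization (per-row L2 norm is an assumed choice)
    return cov / (cov.norm(dim=2, keepdim=True) + 1e-5)
```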

In the excitation module, a GSoP block performs a row-wise convolution to maintain structural information, producing a vector. A fully-connected layer and a sigmoid function are then applied to obtain a $c$-dimensional attention vector. Finally, the input features are multiplied by the attention vector, as in an SE block. A GSoP block can be formulated as:

\begin{align} s = F_\text{gsop}(X, \theta) = \sigma (W\, \text{RC}(\text{Cov}(\text{Conv}(X)))) \end{align}

\begin{align} Y = s \cdot X \end{align}

Here, $\text{Conv}(\cdot)$ reduces the number of channels, $\text{Cov}(\cdot)$ computes the covariance matrix, and $\text{RC}(\cdot)$ denotes row-wise convolution.
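The whole block can be sketched as a single module. The 4x channel expansion in the row-wise convolution, the grouped-convolution trick used to realize $\text{RC}(\cdot)$, and the per-row normalization are assumptions modeled on common implementations, not necessarily the authors' exact code:

```python
import torch
import torch.nn as nn

class GSoPBlock(nn.Module):
    """Sketch of s = sigmoid(W RC(Cov(Conv(X)))) followed by Y = s * X."""

    def __init__(self, c: int, c_prime: int):
        super().__init__()
        self.c_prime = c_prime
        self.reduce = nn.Conv2d(c, c_prime, kernel_size=1)        # Conv(.): c -> c' channels
        # RC(.): row-wise convolution; each of the c' covariance rows gets its own filters
        self.row_conv = nn.Conv2d(c_prime, 4 * c_prime,
                                  kernel_size=(c_prime, 1), groups=c_prime)
        self.fc = nn.Conv2d(4 * c_prime, c, kernel_size=1)        # W: maps back to c dimensions
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        z = self.reduce(x).view(b, self.c_prime, h * w)
        z = z - z.mean(dim=2, keepdim=True)
        cov = torch.bmm(z, z.transpose(1, 2)) / (h * w - 1)       # Cov(.): (b, c', c')
        cov = cov / (cov.norm(dim=2, keepdim=True) + 1e-5)        # assumed row-wise normalization
        cov = cov.unsqueeze(-1)                                   # (b, c', c', 1): rows as height
        s = self.sigmoid(self.fc(self.row_conv(cov)))             # (b, c, 1, 1) attention vector
        return x * s                                              # Y = s * X, channel-wise scaling

# Usage: scale a 256-channel feature map with a 64-channel bottleneck
# block = GSoPBlock(c=256, c_prime=64)
# y = block(torch.randn(2, 256, 14, 14))
```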

Source: Global Second-order Pooling Convolutional Networks
