Attention Mechanisms

Global second-order pooling convolutional networks

Introduced by Gao et al. in Global Second-order Pooling Convolutional Networks

A GSoP block has a squeeze module and an excitation module, and uses second-order pooling to model higher-order statistics while gathering global information. In the squeeze module, a GSoP block first reduces the number of channels from $c$ to $c'$ ($c' < c$) using a $1 \times 1$ convolution, then computes a $c' \times c'$ covariance matrix over the reduced channels to capture their pairwise correlations. Next, row-wise normalization is performed on the covariance matrix. Each entry $(i, j)$ of the normalized covariance matrix explicitly relates channel $i$ to channel $j$.
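A minimal sketch of the squeeze step in PyTorch, assuming an input of shape (batch, c, h, w); the paper only states that normalization is row-wise, so the L2 normalization per row below is an assumption:

```python
import torch
import torch.nn as nn

def gsop_squeeze(x: torch.Tensor, reduce: nn.Conv2d) -> torch.Tensor:
    """Reduce channels with a 1x1 conv, then pool second-order statistics.

    x: (batch, c, h, w); reduce: a 1x1 Conv2d mapping c -> c' channels.
    Returns a row-normalized (batch, c', c') covariance matrix.
    """
    z = reduce(x)                                          # (batch, c', h, w)
    b, cp, h, w = z.shape
    z = z.view(b, cp, h * w)                               # each channel is a length-(h*w) sample
    z = z - z.mean(dim=2, keepdim=True)                    # center each channel
    cov = torch.bmm(z, z.transpose(1, 2)) / (h * w - 1)    # (batch, c', c') channel covariance
    # Row-wise normalization (per-row L2 norm is an assumed choice)
    return cov / (cov.norm(dim=2, keepdim=True) + 1e-5)
```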

In the excitation module, a GSoP block performs a row-wise convolution to maintain structural information, producing a vector. A fully-connected layer and a sigmoid function are then applied to obtain a $c$-dimensional attention vector. Finally, the input features are multiplied by the attention vector, as in an SE block. A GSoP block can be formulated as:

\begin{align} s = F_\text{gsop}(X, \theta) = \sigma (W\, \text{RC}(\text{Cov}(\text{Conv}(X)))) \end{align}

\begin{align} Y = s \cdot X \end{align}

Here, $\text{Conv}(\cdot)$ reduces the number of channels, $\text{Cov}(\cdot)$ computes the covariance matrix, and $\text{RC}(\cdot)$ denotes row-wise convolution.
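The whole block can be sketched as a single module. The 4x channel expansion in the row-wise convolution, the grouped-convolution trick used to realize $\text{RC}(\cdot)$, and the per-row normalization are assumptions modeled on common implementations, not necessarily the authors' exact code:

```python
import torch
import torch.nn as nn

class GSoPBlock(nn.Module):
    """Sketch of s = sigmoid(W RC(Cov(Conv(X)))) followed by Y = s * X."""

    def __init__(self, c: int, c_prime: int):
        super().__init__()
        self.c_prime = c_prime
        self.reduce = nn.Conv2d(c, c_prime, kernel_size=1)        # Conv(.): c -> c' channels
        # RC(.): row-wise convolution; each of the c' covariance rows gets its own filters
        self.row_conv = nn.Conv2d(c_prime, 4 * c_prime,
                                  kernel_size=(c_prime, 1), groups=c_prime)
        self.fc = nn.Conv2d(4 * c_prime, c, kernel_size=1)        # W: maps back to c dimensions
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        z = self.reduce(x).view(b, self.c_prime, h * w)
        z = z - z.mean(dim=2, keepdim=True)
        cov = torch.bmm(z, z.transpose(1, 2)) / (h * w - 1)       # Cov(.): (b, c', c')
        cov = cov / (cov.norm(dim=2, keepdim=True) + 1e-5)        # assumed row-wise normalization
        cov = cov.unsqueeze(-1)                                   # (b, c', c', 1): rows as height
        s = self.sigmoid(self.fc(self.row_conv(cov)))             # (b, c, 1, 1) attention vector
        return x * s                                              # Y = s * X, channel-wise scaling

# Usage: scale a 256-channel feature map with a 64-channel bottleneck
# block = GSoPBlock(c=256, c_prime=64)
# y = block(torch.randn(2, 256, 14, 14))
```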

Source: Global Second-order Pooling Convolutional Networks
