Shake-Shake Regularization

Introduced by Gastaldi in the paper "Shake-Shake regularization".

Shake-Shake Regularization aims to improve the generalization ability of multi-branch networks by replacing the standard summation of parallel branches with a stochastic affine combination. A typical pre-activation ResNet with two residual branches follows:

$$x_{i+1} = x_{i} + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$

Shake-shake regularization introduces, for each residual block $i$, a random variable $\alpha_{i}$ drawn from a uniform distribution on $[0, 1]$ during training:

$$x_{i+1} = x_{i} + \alpha_{i}\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(1\right)}\right) + \left(1-\alpha_{i}\right)\mathcal{F}\left(x_{i}, \mathcal{W}_{i}^{\left(2\right)}\right) $$

Following the same logic as for dropout, all $\alpha_{i}$ are set to the expected value of $0.5$ at test time.
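The forward behavior above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's implementation: `branch1` and `branch2` are toy stand-ins for the residual branches $\mathcal{F}(x, \mathcal{W}^{(1)})$ and $\mathcal{F}(x, \mathcal{W}^{(2)})$, which in practice are convolutional stacks.

```python
import numpy as np

rng = np.random.default_rng(0)

def branch1(x):
    # Toy stand-in for F(x, W^(1)); a real branch is a conv-BN-ReLU stack.
    return 0.5 * x

def branch2(x):
    # Toy stand-in for F(x, W^(2)).
    return -0.25 * x

def shake_shake_block(x, training=True):
    # During training, alpha_i ~ U(0, 1) is resampled per forward pass;
    # at test time alpha_i is fixed to its expected value 0.5,
    # following the same logic as dropout.
    alpha = rng.uniform(0.0, 1.0) if training else 0.5
    return x + alpha * branch1(x) + (1.0 - alpha) * branch2(x)

x = np.ones(4)
y_test = shake_shake_block(x, training=False)   # deterministic combination
y_train = shake_shake_block(x, training=True)   # stochastic combination
```

With these toy branches, the test-time output is $x + 0.5\,(0.5x) + 0.5\,(-0.25x) = 1.125\,x$, while the training output varies between the two extremes of the affine combination.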

Source: Shake-Shake regularization

Tasks

| Task | Papers | Share |
| --- | --- | --- |
| Image Classification | 2 | 28.57% |
| General Classification | 1 | 14.29% |
| Object Detection | 1 | 14.29% |
| Image Augmentation | 1 | 14.29% |
| Image Cropping | 1 | 14.29% |
| Retrieval | 1 | 14.29% |
