
Stochastic Depth

Introduced by Huang et al. in Deep Networks with Stochastic Depth

Stochastic Depth aims to shrink the depth of a network during training, while keeping it unchanged during testing. This is achieved by randomly dropping entire ResBlocks during training and bypassing their transformations through skip connections.

Let $b_{l} \in \{0, 1\}$ denote a Bernoulli random variable, which indicates whether the $l$th ResBlock is active ($b_{l} = 1$) or inactive ($b_{l} = 0$). Further, let us denote the “survival” probability of ResBlock $l$ as $p_{l} = \text{Pr}\left(b_{l} = 1\right)$. With this definition we can bypass the $l$th ResBlock by multiplying its function $f_{l}$ with $b_{l}$, and we extend the update rule to:

$$ H_{l} = \text{ReLU}\left(b_{l}f_{l}\left(H_{l-1}\right) + \text{id}\left(H_{l-1}\right)\right) $$

If $b_{l} = 1$, this reduces to the original ResNet update and the ResBlock remains unchanged. If $b_{l} = 0$, the ResBlock reduces to the identity function, $H_{l} = \text{id}\left(H_{l-1}\right)$.
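The update rule above can be sketched in a few lines of numpy. This is a minimal illustration, not the paper's implementation: `res_block`, `f`, and `p_survive` are hypothetical names, and `f` stands in for the block's transformation $f_{l}$. At test time every block stays active; the comment notes the paper's additional calibration of $f_{l}$ by $p_{l}$.

```python
import numpy as np

rng = np.random.default_rng(0)

def res_block(h, f, p_survive, training=True):
    """One ResBlock update with stochastic depth (illustrative sketch).

    h: input activations H_{l-1}
    f: the block's residual transformation f_l (any callable; hypothetical here)
    p_survive: survival probability p_l = Pr(b_l = 1)
    """
    if training:
        # Bernoulli b_l in {0, 1}: drop the entire block with probability 1 - p_l
        b = rng.binomial(1, p_survive)
    else:
        # Test time: the block is always active; the paper additionally
        # calibrates the output by scaling f_l with its survival probability p_l.
        b = p_survive
    # H_l = ReLU(b_l * f_l(H_{l-1}) + id(H_{l-1}))
    return np.maximum(b * f(h) + h, 0.0)
```

When `b` is sampled as 0, the call reduces to `np.maximum(h, 0.0)`, i.e. the block is bypassed through the skip connection exactly as in the identity case above.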

Source: Deep Networks with Stochastic Depth

Latest Papers

Intermediate Loss Regularization for CTC-based Speech Recognition
Jaesong Lee, Shinji Watanabe (2021-02-05)
Training data-efficient image transformers & distillation through attention
Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou (2020-12-23)
A new semi-supervised self-training method for lung cancer prediction
Kelvin Shak, Mundher Al-Shabi, Andrea Liew, Boon Leong Lan, Wai Yee Chan, Kwan Hoong Ng, Maxine Tan (2020-12-17)
Simple Copy-Paste is a Strong Data Augmentation Method for Instance Segmentation
Golnaz Ghiasi, Yin Cui, Aravind Srinivas, Rui Qian, Tsung-Yi Lin, Ekin D. Cubuk, Quoc V. Le, Barret Zoph (2020-12-13)
Semi-Supervised Noisy Student Pre-training on EfficientNet Architectures for Plant Pathology Classification
Sedrick Scott Keh (2020-12-01)
Very Deep VAEs Generalize Autoregressive Models and Can Outperform Them on Images
Rewon Child (2020-11-20)
Combining Self-Training and Self-Supervised Learning for Unsupervised Disfluency Detection
Shaolei Wang, Zhongyuan Wang, Wanxiang Che, Ting Liu (2020-10-29)
Pushing the Limits of Semi-Supervised Learning for Automatic Speech Recognition
Yu Zhang, James Qin, Daniel S. Park, Wei Han, Chung-Cheng Chiu, Ruoming Pang, Quoc V. Le, Yonghui Wu (2020-10-20)
Noisy Student Training using Body Language Dataset Improves Facial Expression Recognition
Vikas Kumar, Shivansh Rao, Li Yu (2020-08-06)
Semi-Supervised Learning with Data Augmentation for End-to-End ASR
Felix Weninger, Franco Mana, Roberto Gemello, Jesús Andrés-Ferrer, Puming Zhan (2020-07-27)
Uncertainty Quantification in Deep Residual Neural Networks
Lukasz Wandzik, Raul Vicente Garcia, Jörg Krüger (2020-07-09)
Improved Noisy Student Training for Automatic Speech Recognition
Daniel S. Park, Yu Zhang, Ye Jia, Wei Han, Chung-Cheng Chiu, Bo Li, Yonghui Wu, Quoc V. Le (2020-05-19)
SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization
Xianzhi Du, Tsung-Yi Lin, Pengchong Jin, Golnaz Ghiasi, Mingxing Tan, Yin Cui, Quoc V. Le, Xiaodan Song (2019-12-10)
Self-training with Noisy Student improves ImageNet classification
Qizhe Xie, Minh-Thang Luong, Eduard Hovy, Quoc V. Le (2019-11-11)
Incorporating Biological Knowledge with Factor Graph Neural Network for Interpretable Deep Learning
Tianle Ma, Aidong Zhang (2019-06-03)
EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
Mingxing Tan, Quoc V. Le (2019-05-28)
Beyond Finite Layer Neural Networks: Bridging Deep Architectures and Numerical Differential Equations
Yiping Lu, Aoxiao Zhong, Quanzheng Li, Bin Dong (2017-10-27)
Zoneout: Regularizing RNNs by Randomly Preserving Hidden Activations
David Krueger, Tegan Maharaj, János Kramár, Mohammad Pezeshki, Nicolas Ballas, Nan Rosemary Ke, Anirudh Goyal, Yoshua Bengio, Aaron Courville, Chris Pal (2016-06-03)
Swapout: Learning an ensemble of deep architectures
Saurabh Singh, Derek Hoiem, David Forsyth (2016-05-20)
Deep Networks with Stochastic Depth
Gao Huang, Yu Sun, Zhuang Liu, Daniel Sedra, Kilian Weinberger (2016-03-30)
