LayerDrop is a form of structured dropout for Transformer models that has a regularization effect during training and allows for efficient pruning at inference time. During training, it randomly drops layers from the Transformer; at inference time, shallower sub-networks can be extracted with an "every other" strategy, where pruning at rate $p$ means dropping the layers at depth $d$ such that $d \equiv 0 \pmod{\lfloor 1/p \rfloor}$.
Source: Reducing Transformer Depth on Demand with Structured Dropout
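To make the two rules above concrete, here is a minimal PyTorch sketch of both behaviors: independent random layer dropping during training, and deterministic "every other" pruning at inference. This is not the authors' fairseq implementation; the class name `LayerDropEncoder`, the 1-based depth indexing, and the demo hyperparameters are illustrative assumptions. Because Transformer layers are residual, a dropped layer is simply skipped and the input passes through unchanged.

```python
import torch
import torch.nn as nn

class LayerDropEncoder(nn.Module):
    """Stack of Transformer layers with LayerDrop (illustrative sketch).

    Training: each layer is skipped independently with probability p.
    Inference: "every other" pruning drops layers at depth d with
    d == 0 (mod floor(1/p)), matching the formula above.
    Depth is taken as 1-based here (an assumption, not from the source).
    """

    def __init__(self, layers, p=0.25):
        super().__init__()
        self.layers = nn.ModuleList(layers)
        self.p = p

    def forward(self, x):
        if self.training:
            # Training-time structured dropout: skip each layer with prob. p.
            for layer in self.layers:
                if torch.rand(1).item() < self.p:
                    continue  # layer dropped; residual path carries x through
                x = layer(x)
        else:
            # Inference-time pruning: drop every floor(1/p)-th layer.
            stride = int(1 / self.p)
            for d, layer in enumerate(self.layers, start=1):
                if d % stride == 0:
                    continue  # pruned layer
                x = layer(x)
        return x

# Usage sketch with 8 standard encoder layers and p = 0.25,
# so inference keeps 6 of 8 layers (depths 4 and 8 are pruned).
layers = [nn.TransformerEncoderLayer(d_model=64, nhead=4, batch_first=True)
          for _ in range(8)]
model = LayerDropEncoder(layers, p=0.25)
x = torch.randn(2, 10, 64)  # (batch, sequence, model dim)

model.train()
out = model(x)  # random layers dropped this pass
model.eval()
out = model(x)  # deterministic every-other pruning
```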
| Task | Papers | Share |
|---|---|---|
| Language Modelling | 2 | 18.18% |
| Machine Translation | 2 | 18.18% |
| Translation | 2 | 18.18% |
| Cross-Modal Retrieval | 1 | 9.09% |
| Retrieval | 1 | 9.09% |
| Multi-Task Learning | 1 | 9.09% |
| Open-Domain Question Answering | 1 | 9.09% |
| Question Answering | 1 | 9.09% |