Regularization

LayerDrop is a form of structured dropout for Transformer models that has a regularization effect during training and allows for efficient pruning at inference time. During training, entire layers are randomly dropped; at inference, sub-networks of any depth can be extracted, for example with an "every other" strategy where pruning with a rate $p$ means dropping the layers at depth $d$ such that $d \equiv 0 \pmod{\lfloor 1/p \rfloor}$.

Source: Reducing Transformer Depth on Demand with Structured Dropout
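
A minimal PyTorch sketch of the idea (the class and method names here are hypothetical, not taken from the paper's released code): during training each layer is skipped independently with probability $p$, and the "every other" rule then selects which layers to remove for a pruned inference-time model.

```python
import torch
import torch.nn as nn


class LayerDropEncoder(nn.Module):
    """Stack of Transformer layers with LayerDrop applied during training."""

    def __init__(self, layers: nn.ModuleList, p: float = 0.2):
        super().__init__()
        self.layers = layers
        self.p = p  # LayerDrop rate

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for layer in self.layers:
            # Training only: skip each layer independently with probability p,
            # which regularizes the model and makes it robust to pruning.
            if self.training and torch.rand(1).item() < self.p:
                continue
            x = layer(x)
        return x

    def prune_every_other(self) -> nn.ModuleList:
        # Inference-time "every other" pruning: drop layers at depth d with
        # d ≡ 0 (mod ⌊1/p⌋). Zero-based depth indexing is an assumption here.
        step = max(1, int(1 / self.p))
        return nn.ModuleList(
            [layer for d, layer in enumerate(self.layers) if d % step != 0]
        )
```

With $p = 0.5$, for example, the rule removes every second layer, roughly halving the network's depth at inference without retraining.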
