Distributed Methods

PipeTransformer is a method for automated elastic pipelining for efficient distributed training of Transformer models. In PipeTransformer, an adaptive on the fly freeze algorithm is used that can identify and freeze some layers gradually during training, as well as an elastic pipelining system that can dynamically allocate resources to train the remaining active layers. More specifically, PipeTransformer automatically excludes frozen layers from the pipeline, packs active layers into fewer GPUs, and forks more replicas to increase data-parallel width.

Source: PipeTransformer: Automated Elastic Pipelining for Distributed Training of Transformers

Papers


Paper Code Results Date Stars

Components


Component Type
🤖 No Components Found You can add them if they exist; e.g. Mask R-CNN uses RoIAlign

Categories