BinaryBERT

Introduced by Bai et al. in BinaryBERT: Pushing the Limit of BERT Quantization

BinaryBERT is a BERT variant that applies quantization in the form of weight binarization. Specifically, the authors propose ternary weight splitting, which initializes BinaryBERT by equivalently splitting a half-sized ternary network. To obtain BinaryBERT, a half-sized ternary BERT model is first trained; a ternary weight splitting operator is then applied to derive the latent full-precision and quantized weights that initialize the full-sized BinaryBERT. BinaryBERT is then fine-tuned for further refinement.

Source: BinaryBERT: Pushing the Limit of BERT Quantization
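The splitting step can be illustrated with a small sketch. Below is a minimal, hypothetical NumPy illustration of the core identity behind ternary weight splitting: each latent weight w is divided into two latent weights with w1 + w2 = w, chosen so that their binarized values sum back to the original ternary weight. The threshold ternarizer, the sign binarizer with halved scale, and the eps offset are illustrative assumptions, not the exact operator derived in the paper.

```python
import numpy as np

def ternarize(w, thresh):
    # Threshold ternarizer (illustrative): entries below |thresh| become 0,
    # the rest become +/- alpha, with alpha the mean magnitude of kept weights.
    mask = np.abs(w) >= thresh
    alpha = np.abs(w[mask]).mean()
    return np.where(mask, alpha * np.sign(w), 0.0), alpha

def binarize(w, scale):
    # Sign binarizer: every entry becomes +/- scale.
    return scale * np.sign(w)

def ternary_weight_split(w, thresh):
    # Split latent weights w into w1 + w2 == w such that the binarized
    # pair sums back to the ternary weights: b1 + b2 == t.
    t, alpha = ternarize(w, thresh)
    zero = t == 0
    # For zeroed entries, offset the two halves in opposite directions so
    # their signs cancel after binarization (eps is a hypothetical choice).
    eps = np.abs(w) / 2 + 1e-3
    w1 = np.where(zero, w / 2 + eps, w / 2)
    w2 = np.where(zero, w / 2 - eps, w / 2)
    b1 = binarize(w1, alpha / 2)
    b2 = binarize(w2, alpha / 2)
    assert np.allclose(w1 + w2, w)   # latent weights are preserved
    assert np.allclose(b1 + b2, t)   # quantized output is preserved
    return (w1, w2), (b1, b2)

rng = np.random.default_rng(0)
w = rng.normal(size=8)
(w1, w2), (b1, b2) = ternary_weight_split(w, thresh=0.7)
print(np.round(b1 + b2, 3))  # equals the ternary weights of w
```

This is why the full-sized binary model can start from the half-sized ternary one: every ternary weight is replaced by a pair of binary weights that reproduces it exactly, so the split network computes the same function at initialization.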

Tasks


Task               Papers  Share
Model Compression  1       50.00%
Quantization       1       50.00%

Categories

Transformers