Compressing Neural Machine Translation Models with 4-bit Precision

WS 2020 Alham Fikri AjiKenneth Heafield

Neural Machine Translation (NMT) is resource-intensive. We design a quantization procedure to compress fit NMT models better for devices with limited hardware capability... (read more)

PDF Abstract


No code implementations yet. Submit your code now

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper