1 code implementation • 22 Oct 2020 • Minho Ryu, Kichun Lee
The pre-trained language model BERT has brought significant performance improvements across a range of natural language processing tasks.