Exploring the Limits of Simple Learners in Knowledge Distillation for Document Classification with DocBERT

WS 2020 · Ashutosh Adhikari, Achyudh Ram, Raphael Tang, William L. Hamilton, Jimmy Lin

Fine-tuned variants of BERT are able to achieve state-of-the-art accuracy on many natural language processing tasks, although at significant computational costs. In this paper, we verify BERT's effectiveness for document classification and investigate the extent to which BERT-level effectiveness can be obtained by different baselines, combined with knowledge distillation, a popular model compression method...
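The knowledge distillation approach the abstract refers to can be sketched as follows. This is a minimal, generic soft-label distillation loss in the style of Hinton et al.: the student is trained against a weighted sum of the temperature-softened teacher distribution and the hard labels. The `temperature` and `alpha` values are illustrative assumptions, not the paper's actual hyperparameters.

```python
import numpy as np

def softmax(logits, temperature=1.0):
    # Temperature-scaled softmax; higher temperature yields softer distributions.
    z = logits / temperature
    z = z - z.max(axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def distillation_loss(student_logits, teacher_logits, labels,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-label KL term (student vs. teacher) and a
    hard-label cross-entropy term. A generic sketch, not necessarily the
    exact loss used in the paper."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # KL(teacher || student), scaled by T^2 to keep gradient magnitudes
    # comparable across temperatures (standard practice in distillation).
    soft = np.sum(p_teacher * (np.log(p_teacher) - np.log(p_student)), axis=-1)
    soft = (temperature ** 2) * soft.mean()
    # Ordinary cross-entropy of the student against the gold labels.
    p_hard = softmax(student_logits)
    hard = -np.mean(np.log(p_hard[np.arange(len(labels)), labels]))
    return alpha * soft + (1.0 - alpha) * hard
```

In this setup the teacher would be a fine-tuned BERT and the student one of the simpler baselines (e.g. an LSTM or logistic regression over document features); when the student's logits match the teacher's, the soft term vanishes and only the hard-label loss remains.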




