no code implementations • NAACL 2022 • Marzieh Tahaei, Ella Charlaix, Vahid Partovi Nia, Ali Ghodsi, Mehdi Rezagholizadeh
We push the limits of state-of-the-art Transformer-based pre-trained language model compression using Kronecker decomposition.
no code implementations • 13 Sep 2021 • Marzieh S. Tahaei, Ella Charlaix, Vahid Partovi Nia, Ali Ghodsi, Mehdi Rezagholizadeh
We present KroneckerBERT, a compressed version of the BERT_BASE model obtained with this Kronecker decomposition framework.
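The two entries above describe the same KroneckerBERT work (arXiv preprint and NAACL 2022 version) but include no implementation details. As a rough illustration of the core idea, approximating a weight matrix by a Kronecker product of two much smaller factors, here is a minimal PyTorch sketch based on the Van Loan-Pitsianis rearrangement and a rank-1 SVD. The factor shapes, the function name, and the single-factor setting are assumptions made for this example; the paper's actual factorization and training setup are not shown.

```python
import torch

def nearest_kronecker(W, m1, n1, m2, n2):
    """Best Kronecker-product fit W ~ kron(A, B) (Van Loan-Pitsianis).

    W has shape (m1*m2, n1*n2); returns A of shape (m1, n1) and
    B of shape (m2, n2) minimizing ||W - kron(A, B)||_F.
    """
    # Rearrange W so each row holds one flattened (m2 x n2) block of W;
    # the optimal factors then come from a rank-1 SVD of this matrix.
    R = W.reshape(m1, m2, n1, n2).permute(0, 2, 1, 3).reshape(m1 * n1, m2 * n2)
    U, S, Vh = torch.linalg.svd(R, full_matrices=False)
    a = S[0].sqrt() * U[:, 0]
    b = S[0].sqrt() * Vh[0, :]
    return a.reshape(m1, n1), b.reshape(m2, n2)

# Toy usage: a BERT-sized 768x768 weight matrix becomes two small factors.
W = torch.randn(768, 768)
A, B = nearest_kronecker(W, 16, 16, 48, 48)
print(W.numel(), "->", A.numel() + B.numel())      # 589824 -> 2560 parameters
print((torch.kron(A, B) - W).norm() / W.norm())    # error (high for random W)
```

The reconstruction error is large for a random matrix, which is the point of the demo: the compression only pays off when trained weights are (or are fine-tuned to be) close to Kronecker-structured.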
1 code implementation • EMNLP 2021 • François Lagunas, Ella Charlaix, Victor Sanh, Alexander M. Rush
Pre-training has improved model accuracy for both classification and generation tasks at the cost of introducing much larger and slower models.
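The snippet above only states the motivation, and the listing does not spell out the method. As a generic sketch of one common remedy, structured block-level pruning that removes whole blocks of weights so the model becomes smaller and faster, here is a magnitude-based toy version in PyTorch. The block size, keep ratio, magnitude criterion, and function name are illustrative assumptions, not the procedure from the EMNLP 2021 paper.

```python
import torch

def block_magnitude_prune(W, block=(32, 32), keep_ratio=0.5):
    """Zero entire (bh x bw) blocks of W with the lowest mean |weight|.

    A simplified magnitude-based stand-in for learned block pruning:
    structured sparsity lets whole blocks be skipped at inference time.
    Assumes W's dimensions are divisible by the block size.
    """
    bh, bw = block
    rows, cols = W.shape[0] // bh, W.shape[1] // bw
    # Score each block by its mean absolute weight.
    scores = W.reshape(rows, bh, cols, bw).abs().mean(dim=(1, 3))
    # Keep the top-scoring fraction of blocks, zero the rest.
    k = max(1, int(keep_ratio * scores.numel()))
    threshold = scores.flatten().topk(k).values.min()
    mask = (scores >= threshold).to(W.dtype)
    # Expand the block-level mask back to the full weight shape.
    full = mask.repeat_interleave(bh, dim=0).repeat_interleave(bw, dim=1)
    return W * full

# Toy usage: prune half the 32x32 blocks of a feed-forward weight.
W = torch.randn(768, 3072)
W_pruned = block_magnitude_prune(W, block=(32, 32), keep_ratio=0.5)
print((W_pruned == 0).float().mean())   # ~0.5 of entries are now zero
```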
no code implementations • Findings of the Association for Computational Linguistics 2020 • Gabriele Prato, Ella Charlaix, Mehdi Rezagholizadeh
State-of-the-art neural machine translation methods rely on a massive number of parameters.
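Again the snippet is pure motivation, but parameter-heavy translation models are a natural target for quantization, one standard way to shrink them. Below is a minimal sketch of uniform symmetric "fake" quantization (quantize-dequantize) in PyTorch; the bit width, the per-tensor symmetric scheme, and the function name are assumptions for illustration only.

```python
import torch

def fake_quantize(x, num_bits=8):
    """Uniform symmetric quantize-dequantize ("fake quantization").

    Rounds values onto 2**num_bits - 1 evenly spaced integer levels and
    maps them back to floats, simulating low-precision weights or
    activations during training or evaluation.
    """
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for 8 bits
    scale = x.abs().max() / qmax            # per-tensor symmetric scale
    x_int = torch.clamp(torch.round(x / scale), -qmax, qmax)
    return x_int * scale

# Toy usage: 8-bit weights cut storage roughly 4x versus float32.
w = torch.randn(512, 512)
w_q = fake_quantize(w, num_bits=8)
print((w - w_q).abs().max())   # worst-case rounding error, about scale / 2
```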