1 code implementation • 21 Mar 2021 • Yuanxin Liu, Zheng Lin, Fengcheng Yuan
Based on the empirical findings, our best compressed model, dubbed Refined BERT cOmpreSsion with InTegrAted techniques (ROSITA), is $7. 5 \times$ smaller than BERT while maintains $98. 5\%$ of the performance on five tasks of the GLUE benchmark, outperforming the previous BERT compression methods with similar parameter budget.