1 code implementation • ICML 2020 • Saurabh Goyal, Anamitra R. Choudhury, Saurabh M. Raje, Venkatesan T. Chakaravarthy, Yogish Sabharwal, Ashish Verma
We demonstrate that our method attains up to 6.8x reduction in inference time with <1% loss in accuracy when applied over ALBERT, a highly compressed version of BERT.