no code implementations • 30 Jun 2020 • Nandan Kumar Jha, Rajat Saini, Sparsh Mittal
Surprisingly, in some cases, they surpass the accuracy of the baseline networks even with inferior teachers.
no code implementations • 26 Jun 2020 • Nandan Kumar Jha, Rajat Saini, Subhrajit Nag, Sparsh Mittal
We show that, at comparable computational complexity, DNNs with constant group size (E2GC) are more energy-efficient than DNNs with a fixed number of groups (F$g$GC).
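To make the distinction concrete, here is a minimal PyTorch sketch of the two grouping strategies (illustrative only; the layer sizes, function names, and default values are assumptions, not taken from the paper):

```python
import torch
import torch.nn as nn

def fixed_groups_conv(in_ch, out_ch, g=8):
    # FgGC-style: the number of groups is fixed, so the group size
    # (channels per group) grows as the layer widens.
    return nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1, groups=g)

def constant_group_size_conv(in_ch, out_ch, group_size=16):
    # E2GC-style: the group size is fixed, so the number of groups
    # grows as the layer widens (in_ch must be divisible by group_size).
    return nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1,
                     groups=in_ch // group_size)

x = torch.randn(1, 64, 32, 32)
print(fixed_groups_conv(64, 64)(x).shape)         # 8 groups of 8 channels each
print(constant_group_size_conv(64, 64)(x).shape)  # 4 groups of 16 channels each
```

Under the fixed-group-count scheme the per-group width scales with the layer, while the constant-group-size scheme keeps per-group work uniform across layers, which is the property the paper ties to energy efficiency.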
1 code implementation • 26 Jun 2020 • Rajat Saini, Nandan Kumar Jha, Bedanta Das, Sparsh Mittal, C. Krishna Mohan
Our method of subspace attention is orthogonal and complementary to the existing state-of-the-art attention mechanisms used in vision models.
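As a rough illustration of the subspace idea, the sketch below partitions the channels into groups and learns a separate spatial attention map per group. This is a generic, hedged sketch of per-subspace attention, not the exact module from the paper; the class name, the sigmoid gating, and the residual reweighting are all assumptions:

```python
import torch
import torch.nn as nn

class SubspaceAttention(nn.Module):
    """Minimal sketch: split channels into subspaces and learn a
    separate spatial attention map for each subspace."""
    def __init__(self, channels, num_subspaces=4):
        super().__init__()
        assert channels % num_subspaces == 0
        self.g = num_subspaces
        sub = channels // num_subspaces
        # One 1x1 conv per subspace produces a single-channel attention map.
        self.score = nn.ModuleList(nn.Conv2d(sub, 1, kernel_size=1)
                                   for _ in range(num_subspaces))

    def forward(self, x):
        chunks = torch.chunk(x, self.g, dim=1)    # split channels into subspaces
        out = []
        for feats, conv in zip(chunks, self.score):
            attn = torch.sigmoid(conv(feats))     # spatial attention per subspace
            out.append(feats * attn + feats)      # reweight, keep identity path
        return torch.cat(out, dim=1)              # reassemble the channels

x = torch.randn(1, 64, 32, 32)
print(SubspaceAttention(64)(x).shape)  # torch.Size([1, 64, 32, 32])
```

Because the module only rescales features within each channel subspace and preserves the input shape, it can be dropped into a backbone alongside other attention mechanisms, which is what makes it orthogonal and complementary to them.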