no code implementations • 25 Dec 2022 • Ibtihel Amara, Nazanin Sepahvand, Brett H. Meyer, Warren J. Gross, James J. Clark
We show that adaptively balancing between the reverse and forward divergences shifts the focus of the training strategy to the compact student network without limiting the teacher network's learning process.