no code implementations • ICLR 2019 • Jeffrey L. McKinstry, Steven K. Esser, Rathinakumar Appuswamy, Deepika Bablani, John V. Arthur, Izzet B. Yildiz, Dharmendra S. Modha
Therefore, we (a) reduce solution distance by starting with pretrained fp32 precision baseline networks and fine-tuning, and (b) combat gradient noise introduced by quantization by training longer and reducing learning rates.