no code implementations • 18 Feb 2023 • Atsushi Nitanda, Ryuhei Kikuchi, Shugo Maeda
Stochastic gradient descent is a workhorse for training deep neural networks due to its excellent generalization performance.