20 Mar 2024 • Abhinab Bhattacharjee, Andrey A. Popov, Arash Sarshar, Adrian Sandu
The Adam optimizer, widely used in machine learning to train neural networks, corresponds to an underlying ordinary differential equation (ODE) in the limit of very small learning rates.
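
To make the discrete side of this correspondence concrete, here is a minimal sketch of the standard Adam update (Kingma and Ba) for a scalar parameter. It is not the authors' code and does not reproduce the ODE itself. The function name `adam_minimize`, the test objective f(θ) = θ², and all default values are assumptions chosen for illustration. Because each step moves the parameter by roughly the learning rate, shrinking the learning rate makes the iterates trace an increasingly smooth path, which is the regime in which a continuous-time description becomes meaningful.

```python
import math


def adam_minimize(grad, theta0, lr=1e-3, beta1=0.9, beta2=0.999,
                  eps=1e-8, steps=1000):
    """Run `steps` iterations of the standard Adam update on a scalar.

    `grad` is a callable returning the gradient at the current parameter.
    All names and defaults here are illustrative assumptions.
    """
    theta = float(theta0)
    m = 0.0  # running average of gradients (first moment)
    v = 0.0  # running average of squared gradients (second moment)
    for k in range(1, steps + 1):
        g = grad(theta)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        # Bias correction compensates for initializing m and v at zero.
        m_hat = m / (1.0 - beta1 ** k)
        v_hat = v / (1.0 - beta2 ** k)
        theta -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta


def grad_quadratic(theta):
    """Gradient of the hypothetical test objective f(theta) = theta**2."""
    return 2.0 * theta


if __name__ == "__main__":
    # The first step has size close to lr, regardless of gradient magnitude.
    print(adam_minimize(grad_quadratic, 1.0, lr=0.01, steps=1))
    # Ten small steps move the parameter only a short distance.
    print(adam_minimize(grad_quadratic, 1.0, lr=0.01, steps=10))
```

In this sketch the hyperparameters `beta1` and `beta2` are held fixed while `lr` varies. How they should scale with the learning rate to obtain a particular limiting ODE is a modeling choice that this example does not address.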