1 code implementation • 13 Mar 2022 • Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh
Our scheme is based on the following algorithmic tools and features: (a) asynchronous local gradient updates in the shared memory of the workers, (b) partial backpropagation, and (c) non-blocking in-place averaging of the local models.
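The abstract names the ingredients but not their implementation; the following minimal Python sketch (the toy objective, dimensions, and averaging rule are all our assumptions, not the authors' code) shows how (a), (b), and (c) might fit together in a shared-memory setting:

```python
# Hypothetical sketch: (a) workers asynchronously update a model held in
# shared memory, (b) "partial backpropagation" is mimicked by updating only
# a random subset of coordinates per step, and (c) each worker averages its
# local model into the shared buffer in place, without taking a lock.
import threading
import numpy as np

DIM, WORKERS, STEPS, LR = 10, 4, 100, 0.05
shared = np.zeros(DIM)  # model in shared memory (the process heap here)

def grad(x, rng):
    # Noisy gradient of the toy objective ||x||^2.
    return 2 * x + 0.1 * rng.standard_normal(DIM)

def worker(seed):
    rng = np.random.default_rng(seed)
    local = shared.copy()  # worker's local model
    for _ in range(STEPS):
        block = rng.choice(DIM, size=DIM // 2, replace=False)  # partial-backprop proxy
        g = grad(local, rng)
        local[block] -= LR * g[block]        # (a)+(b): asynchronous partial update
        shared[:] = 0.5 * (shared + local)   # (c): non-blocking in-place averaging
        local = shared.copy()

threads = [threading.Thread(target=worker, args=(s,)) for s in range(WORKERS)]
for t in threads: t.start()
for t in threads: t.join()
print("final model norm:", np.linalg.norm(shared))
```

The deliberate absence of locks around the shared buffer is what makes the averaging non-blocking; the resulting races are exactly the inconsistency the asynchronous analysis has to tolerate.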
no code implementations • 1 Jan 2021 • Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh
On the theoretical side, we show that this method guarantees ergodic convergence for non-convex objectives, and achieves the classic sublinear rate under standard assumptions.
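As a reference point (this is the standard form from the literature, not a formula quoted from the abstract), the classic sublinear rate for non-convex stochastic gradient methods is the ergodic bound

```latex
\min_{t \le T} \mathbb{E}\left[\|\nabla f(x_t)\|^2\right]
  \;\le\; \frac{1}{T}\sum_{t=1}^{T} \mathbb{E}\left[\|\nabla f(x_t)\|^2\right]
  \;=\; \mathcal{O}\!\left(\frac{1}{\sqrt{T}}\right),
```

i.e., the gradient norm vanishes on average over the iterates rather than at the last iterate, which is what "ergodic convergence" refers to.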
no code implementations • 12 Jun 2020 • Vyacheslav Kungurtsev, Bapi Chatterjee, Dan Alistarh
Stochastic Gradient Langevin Dynamics (SGLD) provides strong convergence-in-measure guarantees for sampling log-concave posterior distributions by adding noise to the stochastic gradient iterates.
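The SGLD update itself is standard: a stochastic gradient step plus Gaussian noise scaled by the square root of the step size. A minimal sketch (the toy Gaussian target and step-size schedule are our choices, not the paper's):

```python
# One SGLD step: theta' = theta - eta * grad(theta) + sqrt(2 * eta) * xi,
# with xi standard Gaussian noise; this injected noise is what turns SGD
# iterates into (approximate) posterior samples.
import numpy as np

rng = np.random.default_rng(0)

def sgld_step(theta, stoch_grad, step_size):
    noise = rng.standard_normal(theta.shape)
    return theta - step_size * stoch_grad(theta) + np.sqrt(2.0 * step_size) * noise

# Toy usage: sample a standard Gaussian posterior, -log p(theta) = ||theta||^2 / 2,
# so the (here exact, for simplicity) gradient is just theta.
theta = np.zeros(2)
for t in range(1, 1001):
    theta = sgld_step(theta, lambda th: th, step_size=0.1 / t**0.55)
print(theta)
```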
no code implementations • 16 Jan 2020 • Giorgi Nadiradze, Ilia Markov, Bapi Chatterjee, Vyacheslav Kungurtsev, Dan Alistarh
Our framework, called elastic consistency, enables us to derive convergence bounds for a variety of distributed SGD methods used in practice to train large-scale machine learning models.
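The abstract does not state the condition itself; as a hedged reading, elastic consistency is a bounded-inconsistency requirement of roughly the following shape, where $x_t$ is the true iterate, $v_t$ is the possibly stale or inconsistent view at which a worker evaluates its gradient, and $B$ is a constant (the exact form in the paper, e.g. whether the bound carries a step-size factor, may differ):

```latex
\mathbb{E}\big[\,\|x_t - v_t\|^2\,\big] \;\le\; B^2 \qquad \text{for all } t.
```

Any distributed SGD variant whose view error satisfies such a bound then inherits a convergence guarantee from the framework, which is what lets one analysis cover many practical schemes at once.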
no code implementations • 25 Sep 2019 • Vyacheslav Kungurtsev, Malcolm Egan, Bapi Chatterjee, Dan Alistarh
This is all the more surprising since these objectives are the ones appearing in the training of deep neural networks.