Search Results for author: Artin Spiridonoff

Found 3 papers, 1 paper with code

Communication-efficient SGD: From Local SGD to One-Shot Averaging

no code implementations • NeurIPS 2021 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis

While it is possible to obtain a linear reduction in the variance by averaging all the stochastic gradients at every step, this requires a lot of communication between the workers and the server, which can dramatically reduce the gains from parallelism.
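The trade-off described in this abstract is easy to see in a toy simulation. The sketch below is an illustration, not the paper's code: it runs Local SGD on a simple quadratic with made-up values for the number of workers, step size, noise level, and averaging period H, and only compares communication counts between averaging every step (H = 1) and averaging rarely (H = 50).

```python
import numpy as np

def local_sgd(n_workers, T, H, lr=0.1, noise=1.0, seed=0):
    """Run T SGD steps on f(x) = 0.5 * x**2 at each worker, averaging every H steps."""
    rng = np.random.default_rng(seed)
    x = np.ones(n_workers)                 # each worker's local iterate
    comms = 0
    for t in range(1, T + 1):
        grads = x + noise * rng.standard_normal(n_workers)  # gradient of 0.5*x^2 plus noise
        x = x - lr * grads                 # independent local step on every worker
        if t % H == 0:                     # communicate only every H-th step
            x[:] = x.mean()                # server averages and broadcasts
            comms += 1
    return x.mean(), comms

for H in (1, 50):                          # average every step vs. every 50 steps
    final, comms = local_sgd(n_workers=16, T=1000, H=H)
    print(f"H={H:3d}: final averaged iterate = {final:+.4f}, communication rounds = {comms}")
```

With H = 1 the workers enjoy the full 1/n variance reduction but pay one communication round per step; with H = 50 the communication count drops by a factor of 50 while the iterates drift apart between averaging rounds.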

Local SGD With a Communication Overhead Depending Only on the Number of Workers

no code implementations • 3 Jun 2020 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis

While the initial analysis of Local SGD showed it needs $\Omega(\sqrt{T})$ communications for $T$ local gradient steps in order for the error to scale proportionally to $1/(nT)$, this has been successively improved in a string of papers, with the state of the art requiring $\Omega\left( n \cdot \mathrm{poly}\log(T) \right)$ communications.
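To get a feel for how differently the two bounds scale, the snippet below plugs in example values: $n = 16$, constants taken as 1, and $\log^2 T$ standing in for the unspecified polynomial in $\log T$. These choices are assumptions for illustration only; the point is that $\sqrt{T}$ keeps growing with $T$ while $n \cdot \mathrm{poly}\log(T)$ grows much more slowly.

```python
import math

n = 16
for T in (10**4, 10**6, 10**8, 10**10):
    early    = math.sqrt(T)               # Omega(sqrt(T)) rounds from the initial analysis
    improved = n * math.log(T) ** 2       # Omega(n * polylog(T)); log^2(T) is an assumed stand-in
    print(f"T = {T:>14,}: sqrt(T) ~ {early:>10,.0f}   n*log(T)^2 ~ {improved:>8,.0f}")
```

For small $T$ the $n \cdot \log^2(T)$ count can exceed $\sqrt{T}$, but as $T$ grows the polylogarithmic bound quickly becomes far smaller.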

Robust Asynchronous Stochastic Gradient-Push: Asymptotically Optimal and Network-Independent Performance for Strongly Convex Functions

1 code implementation • 9 Nov 2018 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis

We consider the standard model of distributed optimization of a sum of functions $F(\mathbf{z}) = \sum_{i=1}^n f_i(\mathbf{z})$, where node $i$ in a network holds the function $f_i(\mathbf{z})$.

Optimization and Control
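For readers unfamiliar with gradient-push, the sketch below shows the standard synchronous push-sum gradient template on a directed ring with simple quadratic local objectives. It is an assumed minimal example of the setup the paper builds on, not the paper's robust asynchronous algorithm; the graph, step-size schedule, and noise level are illustrative choices.

```python
import numpy as np

n, T = 5, 3000
a = np.arange(n, dtype=float)    # local objectives f_i(z) = 0.5*(z - a_i)^2; their sum is minimized at mean(a)
rng = np.random.default_rng(0)

x = np.zeros(n)                  # push-sum numerators (one per node)
w = np.ones(n)                   # push-sum weights
# Column-stochastic mixing for a directed ring: node i keeps half of its
# values and pushes the other half to node (i + 1) mod n.
P = 0.5 * (np.eye(n) + np.roll(np.eye(n), 1, axis=0))

for t in range(T):
    z = x / w                                           # de-biased local estimates
    grads = (z - a) + 0.1 * rng.standard_normal(n)      # noisy local gradients
    step = 1.0 / (t + 1)                                # O(1/t) step size for strongly convex objectives
    x = P @ (x - step * grads)                          # push updated numerators along outgoing edges
    w = P @ w                                           # push weights

print("node estimates:", np.round(x / w, 3), "  optimum:", a.mean())
```

Each node only needs to know its out-neighbors on the directed graph; dividing the numerators by the push-sum weights corrects the bias introduced by the non-doubly-stochastic mixing, so every node's estimate approaches the minimizer of the global sum.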
