no code implementations • NeurIPS 2021 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis
While it is possible to obtain a linear reduction in the variance by averaging all the stochastic gradients at every step, this requires a lot of communication between the workers and the server, which can dramatically reduce the gains from parallelism.
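The abstract contrasts averaging every stochastic gradient with communicating only occasionally. Below is a minimal Local SGD sketch on a toy quadratic problem; the per-worker objectives, step size, noise level, and synchronization interval are illustrative assumptions, not the paper's setup.

```python
import numpy as np

# Toy Local SGD sketch: each of n workers holds f_i(z) = 0.5*||z - c_i||^2 and
# runs H local SGD steps between averaging (communication) rounds, instead of
# averaging stochastic gradients at every single step.
rng = np.random.default_rng(0)
n, d, T, H, lr = 8, 5, 400, 20, 0.1      # workers, dimension, steps, sync interval, step size
centers = rng.normal(size=(n, d))        # c_i; the minimizer of the average objective is centers.mean(0)
z = np.zeros((n, d))                     # each worker's local iterate

for t in range(1, T + 1):
    noise = 0.1 * rng.normal(size=(n, d))
    grads = (z - centers) + noise        # stochastic gradient of f_i at the local iterate
    z -= lr * grads                      # every worker takes a local step
    if t % H == 0:                       # communicate only once every H steps
        z[:] = z.mean(axis=0)            # server averages the local iterates

print("communication rounds:", T // H)
print("averaged iterate:", np.round(z.mean(axis=0), 3))
print("true minimizer:  ", np.round(centers.mean(axis=0), 3))
```

With these toy settings the workers communicate T/H = 20 times instead of 400, while the averaged iterate still approaches the minimizer of the sum.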
no code implementations • 3 Jun 2020 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis
While the initial analysis of Local SGD showed it needs $\Omega(\sqrt{T})$ communications for $T$ local gradient steps in order for the error to scale proportionately to $1/(nT)$, this has been successively improved in a string of papers, with the state-of-the-art requiring $\Omega\left( n \,\mathrm{poly}(\log T) \right)$ communications.
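As a rough illustration of the two bounds quoted above, the snippet below evaluates $\sqrt{T}$ against $n \log^2 T$ for a few values of $T$; the constants and the choice of $\log^2 T$ as the polynomial in $\log T$ are arbitrary assumptions, so only the growth trend in $T$ is meaningful, not the absolute numbers.

```python
import math

# Back-of-the-envelope look at how the two communication bounds grow with T.
# All constants, and the choice of log^2 T as the "polynomial in log T", are
# arbitrary illustrative assumptions; only the growth trend in T is meaningful.
def rounds_sqrt(T):
    return math.sqrt(T)                  # Omega(sqrt(T)) communication rounds

def rounds_polylog(T, n, power=2):
    return n * math.log(T) ** power      # Omega(n * poly(log T)) communication rounds

n = 16
for exp in (4, 6, 8, 10):
    T = 10 ** exp
    print(f"T=1e{exp:<2}  sqrt(T) ~ {rounds_sqrt(T):>9.0f}   n*log^2(T) ~ {rounds_polylog(T, n):>7.0f}")
```

The $\sqrt{T}$ count keeps growing with the number of local steps, while the $n\,\mathrm{poly}(\log T)$ count grows only logarithmically in $T$.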
1 code implementation • 9 Nov 2018 • Artin Spiridonoff, Alex Olshevsky, Ioannis Ch. Paschalidis
We consider the standard model of distributed optimization of a sum of functions $F(\mathbf{z}) = \sum_{i=1}^n f_i(\mathbf{z})$, where node $i$ in a network holds the function $f_i(\mathbf{z})$.
Optimization and Control
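The model above, $F(\mathbf{z}) = \sum_{i=1}^n f_i(\mathbf{z})$ with $f_i$ held privately at node $i$, can be illustrated with a toy sketch of consensus-based distributed gradient descent on a ring network. This is a generic decentralized method for the same model, not the asynchronous gradient-push algorithm the paper actually studies; the local objectives, mixing matrix, and step sizes are made up for the example.

```python
import numpy as np

# Toy sketch of the model F(z) = sum_i f_i(z), where node i privately holds
# f_i(z) = 0.5*||A_i z - b_i||^2, solved with consensus-based distributed
# gradient descent on a ring network. (Illustrative only; not the gradient-push
# algorithm analyzed in the paper.)
rng = np.random.default_rng(1)
n, d = 6, 3
A = rng.normal(size=(n, 4, d))           # node i's private data A_i (4 x d)
b = rng.normal(size=(n, 4))              # node i's private data b_i

def grad_fi(i, z_i):
    return A[i].T @ (A[i] @ z_i - b[i])  # gradient of f_i at node i's iterate

# Doubly stochastic mixing matrix for a ring: each node averages with itself
# and its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    for j in (i - 1, i, i + 1):
        W[i, j % n] = 1.0 / 3.0

z = np.zeros((n, d))                     # z[i] is node i's local copy of the decision variable
for t in range(1, 2001):
    z = W @ z                            # consensus step: mix with ring neighbors
    step = 0.1 / np.sqrt(t)              # diminishing step size
    z = z - step * np.stack([grad_fi(i, z[i]) for i in range(n)])

print("node iterates (should nearly agree and approximate the minimizer of F):")
print(np.round(z, 3))
```

Each node only ever evaluates its own $f_i$ and exchanges iterates with its immediate neighbors, which is the defining feature of this distributed setup.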