Communication-Efficient Local Decentralized SGD Methods

21 Oct 2019  ·  Xiang Li, Wenhao Yang, Shusen Wang, Zhihua Zhang ·

Recently, the technique of local updates is a powerful tool in centralized settings to improve communication efficiency via periodical communication. For decentralized settings, it is still unclear how to efficiently combine local updates and decentralized communication. In this work, we propose an algorithm named as LD-SGD, which incorporates arbitrary update schemes that alternate between multiple Local updates and multiple Decentralized SGDs, and provide an analytical framework for LD-SGD. Under the framework, we present a sufficient condition to guarantee the convergence. We show that LD-SGD converges to a critical point for a wide range of update schemes when the objective is non-convex and the training data are non-identically independent distributed. Moreover, our framework brings many insights into the design of update schemes for decentralized optimization. As examples, we specify two update schemes and show how they help improve communication efficiency. Specifically, the first scheme alternates the number of local and global update steps. From our analysis, the ratio of the number of local updates to that of decentralized SGD trades off communication and computation. The second scheme is to periodically shrink the length of local updates. We show that the decaying strategy helps improve communication efficiency both theoretically and empirically.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods