no code implementations • 19 Feb 2024 • Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, DaCheng Tao, Enhong Chen
However, existing compression methods either perform only unidirectional compression within one iteration, which incurs a higher communication cost, or perform bidirectional compression, which suffers from a slower convergence rate.
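As a rough illustration of the distinction, here is a minimal NumPy sketch of one synchronous parameter-server step with a toy top-k sparsifier (the compressor, dimensions, and function names are hypothetical, not the paper's method): unidirectional schemes compress only the worker-to-server gradients, whereas a bidirectional scheme also compresses the server-to-worker update.

```python
import numpy as np

def topk_compress(v, k):
    """Toy compressor: keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def bidirectional_step(params, worker_grads, lr=0.1, k=10):
    """One synchronous step with compression on both communication directions."""
    # uplink (worker -> server): each worker sends a compressed gradient
    uplink = [topk_compress(g, k) for g in worker_grads]
    avg = np.mean(uplink, axis=0)
    # downlink (server -> workers): the aggregated update is compressed again
    downlink = topk_compress(avg, k)
    return params - lr * downlink

# toy usage: 4 workers, 100-dimensional parameters
rng = np.random.default_rng(0)
params = rng.normal(size=100)
grads = [rng.normal(size=100) for _ in range(4)]
params = bidirectional_step(params, grads)
```

Compressing the downlink as well roughly halves the traffic per round, but the applied update now carries compression error from both directions, which is the convergence-rate trade-off the abstract refers to.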
1 code implementation • 28 Feb 2022 • Joya Chen, Kai Xu, Yuhui Wang, Yifei Cheng, Angela Yao
A standard hardware bottleneck when training deep neural networks is GPU memory.
no code implementations • 11 Jun 2020 • Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu
Distributed parallel stochastic gradient descent algorithms are the workhorses of large-scale machine learning tasks.
1 code implementation • 30 Dec 2019 • Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng
To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, employing multiple workers in parallel.
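For context, here is a minimal sketch of plain synchronous data-parallel SGD (a generic baseline, not the specific variant proposed in the paper; the linear-regression objective is made up for illustration): each worker computes a gradient on its own shard of the batch, and the averaged gradient is applied once, which matches single-worker SGD on the full batch when the shards are equal-sized.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean squared error for a linear model X @ w ≈ y."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def parallel_sgd_step(w, X, y, num_workers=4, lr=0.01):
    """Shard the batch across workers, compute per-shard gradients 'in parallel',
    average them, and take one global step."""
    grads = [grad_mse(w, Xs, ys)
             for Xs, ys in zip(np.array_split(X, num_workers),
                               np.array_split(y, num_workers))]
    return w - lr * np.mean(grads, axis=0)

# toy usage
rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 10)), rng.normal(size=256)
w = np.zeros(10)
for _ in range(100):
    w = parallel_sgd_step(w, X, y)
```

In a real deployment each per-shard gradient would be computed on a separate device and combined with an all-reduce, which is exactly where the communication cost discussed in these papers enters.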
13 code implementations • 11 Sep 2019 • Joya Chen, Dong Liu, Tong Xu, Shiwei Wu, Yifei Cheng, Enhong Chen
In this paper, we challenge the necessity of such hard/soft sampling methods for training accurate deep object detectors.
no code implementations • 28 Jun 2019 • Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng
Nevertheless, although distributed stochastic gradient descent (SGD) algorithms can achieve a linear iteration speedup, in practice they are significantly limited by communication cost, making it difficult to achieve a linear time speedup.
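The gap between iteration speedup and time speedup can be seen with a back-of-the-envelope cost model (the numbers below are made up purely for illustration): if per-iteration compute parallelizes perfectly across P workers but each iteration also pays a fixed synchronization cost, wall-clock speedup saturates no matter how many workers are added.

```python
# Toy cost model: time per iteration on P workers = compute / P + comm,
# while a single worker pays no communication. All numbers are illustrative.
compute = 100.0  # ms of gradient computation, assumed to split perfectly
comm = 20.0      # ms of synchronization per iteration, paid regardless of P

for P in (4, 16, 64, 256):
    t = compute / P + comm
    print(f"P={P:4d}  iteration speedup = {P}x  time speedup = {compute / t:.1f}x")
# time speedup is capped at compute / comm = 5x here, however large P grows
```

Under these assumptions the number of iterations processed per second still grows linearly with P, but wall-clock time per iteration is dominated by the fixed communication term, which is why reducing communication cost is the central concern of this line of work.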