no code implementations • 19 Feb 2024 • Yifei Cheng, Li Shen, Linli Xu, Xun Qian, Shiwei Wu, Yiming Zhou, Tie Zhang, DaCheng Tao, Enhong Chen
However, existing compression methods either perform only unidirectional compression within one iteration, which incurs a higher communication cost, or perform bidirectional compression, which suffers from a slower convergence rate.
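As a rough illustration of the distinction, here is a minimal NumPy sketch of one synchronous parameter-server step with a toy top-k sparsifier (the compressor, dimensions, and function names are hypothetical, not the paper's method): unidirectional schemes compress only the worker-to-server gradients, whereas a bidirectional scheme also compresses the server-to-worker update.

```python
import numpy as np

def topk_compress(v, k):
    """Toy compressor: keep the k largest-magnitude entries of v, zero the rest."""
    out = np.zeros_like(v)
    idx = np.argsort(np.abs(v))[-k:]
    out[idx] = v[idx]
    return out

def bidirectional_step(params, worker_grads, lr=0.1, k=10):
    """One synchronous step with compression on both communication directions."""
    # uplink (worker -> server): each worker sends a compressed gradient
    uplink = [topk_compress(g, k) for g in worker_grads]
    avg = np.mean(uplink, axis=0)
    # downlink (server -> workers): the aggregated update is compressed again
    downlink = topk_compress(avg, k)
    return params - lr * downlink

# toy usage: 4 workers, 100-dimensional parameters
rng = np.random.default_rng(0)
params = rng.normal(size=100)
grads = [rng.normal(size=100) for _ in range(4)]
params = bidirectional_step(params, grads)
```

Compressing the downlink as well roughly halves the traffic per round, but the applied update now carries compression error from both directions, which is the convergence-rate trade-off the abstract refers to.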
1 code implementation • 28 Feb 2022 • Joya Chen, Kai Xu, Yuhui Wang, Yifei Cheng, Angela Yao
A standard hardware bottleneck when training deep neural networks is GPU memory.
no code implementations • 11 Jun 2020 • Shuheng Shen, Yifei Cheng, Jingchang Liu, Linli Xu
Distributed parallel stochastic gradient descent algorithms are the workhorses of large-scale machine learning tasks.
1 code implementation • 30 Dec 2019 • Xianfeng Liang, Shuheng Shen, Jingchang Liu, Zhen Pan, Enhong Chen, Yifei Cheng
To accelerate the training of machine learning models, distributed stochastic gradient descent (SGD) and its variants have been widely adopted, employing multiple workers in parallel.
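For context, here is a minimal sketch of plain synchronous data-parallel SGD (a generic baseline, not the specific variant proposed in the paper; the linear-regression objective is made up for illustration): each worker computes a gradient on its own shard of the batch, and the averaged gradient is applied once, which matches single-worker SGD on the full batch when the shards are equal-sized.

```python
import numpy as np

def grad_mse(w, X, y):
    """Gradient of mean squared error for a linear model X @ w ≈ y."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

def parallel_sgd_step(w, X, y, num_workers=4, lr=0.01):
    """Shard the batch across workers, compute per-shard gradients 'in parallel',
    average them, and take one global step."""
    grads = [grad_mse(w, Xs, ys)
             for Xs, ys in zip(np.array_split(X, num_workers),
                               np.array_split(y, num_workers))]
    return w - lr * np.mean(grads, axis=0)

# toy usage
rng = np.random.default_rng(0)
X, y = rng.normal(size=(256, 10)), rng.normal(size=256)
w = np.zeros(10)
for _ in range(100):
    w = parallel_sgd_step(w, X, y)
```

In a real deployment each per-shard gradient would be computed on a separate device and combined with an all-reduce, which is exactly where the communication cost discussed in these papers enters.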
13 code implementations • 11 Sep 2019 • Joya Chen, Dong Liu, Tong Xu, Shiwei Wu, Yifei Cheng, Enhong Chen
In this paper, we challenge the necessity of such hard/soft sampling methods for training accurate deep object detectors.
no code implementations • 28 Jun 2019 • Shuheng Shen, Linli Xu, Jingchang Liu, Xianfeng Liang, Yifei Cheng
Nevertheless, although distributed stochastic gradient descent (SGD) algorithms can achieve a linear iteration speedup, in practice they are significantly limited by communication cost, making it difficult to achieve a linear time speedup.
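The gap between iteration speedup and time speedup can be seen with a back-of-the-envelope cost model (the numbers below are made up purely for illustration): if per-iteration compute parallelizes perfectly across P workers but each iteration also pays a fixed synchronization cost, wall-clock speedup saturates no matter how many workers are added.

```python
# Toy cost model: time per iteration on P workers = compute / P + comm,
# while a single worker pays no communication. All numbers are illustrative.
compute = 100.0  # ms of gradient computation, assumed to split perfectly
comm = 20.0      # ms of synchronization per iteration, paid regardless of P

for P in (4, 16, 64, 256):
    t = compute / P + comm
    print(f"P={P:4d}  iteration speedup = {P}x  time speedup = {compute / t:.1f}x")
# time speedup is capped at compute / comm = 5x here, however large P grows
```

Under these assumptions the number of iterations processed per second still grows linearly with P, but wall-clock time per iteration is dominated by the fixed communication term, which is why reducing communication cost is the central concern of this line of work.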