1 code implementation • 10 Jun 2022 • Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li
These features make it necessary to apply 3D parallelism, which integrates data parallelism, pipeline model parallelism and tensor model parallelism, to achieve high training efficiency.
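As a rough illustration of what a 3D-parallel device grid looks like, here is a minimal sketch that maps global ranks onto (data, pipeline, tensor) coordinates. The grid sizes and the rank-ordering convention are illustrative assumptions, not the configuration used in the paper.

```python
# Hypothetical sketch: lay out a 3D-parallel device grid.
# The dp/pp/tp split and rank-mapping convention are assumptions.

def build_3d_grid(world_size: int, dp: int, pp: int, tp: int):
    """Map each global rank to (data, pipeline, tensor) coordinates."""
    assert dp * pp * tp == world_size, "grid must cover all devices"
    grid = {}
    for rank in range(world_size):
        tp_coord = rank % tp              # fastest-varying: tensor-parallel group
        pp_coord = (rank // tp) % pp      # next: pipeline stage
        dp_coord = rank // (tp * pp)      # slowest: data-parallel replica
        grid[rank] = (dp_coord, pp_coord, tp_coord)
    return grid

# Example: 8 GPUs split as 2-way data x 2-way pipeline x 2-way tensor.
for rank, coords in build_3d_grid(8, dp=2, pp=2, tp=2).items():
    print(f"rank {rank} -> (dp, pp, tp) = {coords}")
```

Keeping the tensor-parallel dimension fastest-varying is a common choice because tensor-parallel groups communicate most frequently and benefit from being placed on the same node.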
1 code implementation • 30 Mar 2022 • Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li
To the best of our knowledge, we are the first to propose a dynamic runtime scheduler that combines tensor swapping and tensor recomputation without user oversight.
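To illustrate the trade-off such a scheduler navigates, the sketch below picks, per tensor, the cheaper of swapping to host memory or recomputing from scratch. The `TensorInfo` fields and the PCIe-bandwidth cost model are hypothetical, not the paper's actual scheduler.

```python
# Hypothetical sketch: swap-vs-recompute decision under an assumed cost model.
from dataclasses import dataclass

@dataclass
class TensorInfo:
    name: str
    bytes: int            # tensor size in bytes
    recompute_ms: float   # estimated time to regenerate the tensor

PCIE_GBPS = 12.0  # assumed effective host<->device bandwidth

def eviction_plan(tensors):
    """Choose the cheaper way to free each tensor's GPU memory."""
    plan = {}
    for t in tensors:
        swap_ms = t.bytes / (PCIE_GBPS * 1e9) * 1e3  # transfer-time estimate
        plan[t.name] = "swap" if swap_ms < t.recompute_ms else "recompute"
    return plan

print(eviction_plan([
    TensorInfo("conv1_act", 256 << 20, recompute_ms=40.0),  # cheaper to swap
    TensorInfo("relu2_act", 4 << 20, recompute_ms=0.3),     # cheaper to redo
]))
```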
no code implementations • 18 Oct 2021 • Shengwei Li, Zhiquan Lai, Dongsheng Li, Yiming Zhang, Xiangyu Ye, Yabo Duan
EmbRace introduces Sparsity-aware Hybrid Communication, which integrates AlltoAll and model parallelism into data-parallel training to reduce the communication overhead of highly sparse parameters.
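The sketch below illustrates the general hybrid pattern, assuming an already-initialized PyTorch process group: AllReduce for dense gradients and AlltoAll for rank-owned embedding shards. The `hybrid_grad_sync` helper and the equal-sized shards are illustrative assumptions, not EmbRace's actual implementation.

```python
# Hypothetical sketch: hybrid gradient sync for dense + sparse parameters.
import torch
import torch.distributed as dist

def hybrid_grad_sync(dense_grads, embed_grad_shards):
    # Dense gradients: every rank holds a full copy, so AllReduce averages them.
    for g in dense_grads:
        dist.all_reduce(g, op=dist.ReduceOp.SUM)
        g /= dist.get_world_size()

    # Sparse embedding gradients: each rank owns a slice of the embedding
    # table (model parallelism), so AlltoAll routes each shard to its owner
    # instead of broadcasting the mostly-zero full gradient everywhere.
    recv = [torch.empty_like(s) for s in embed_grad_shards]
    dist.all_to_all(recv, embed_grad_shards)
    return recv
```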
no code implementations • 5 Oct 2021 • Keshi Ge, Yongquan Fu, Zhiquan Lai, Xiaoge Deng, Dongsheng Li
The distributed stochastic gradient descent (SGD) approach is widely used in large-scale deep learning, and the gradient collective communication method is vital to the training scalability of distributed deep learning systems.
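For context, here is a minimal sketch of the gradient collective step in synchronous data-parallel SGD, assuming torch.distributed is initialized (e.g. via torchrun). This is the generic AllReduce pattern, not the specific collective method studied in the paper.

```python
# Hypothetical sketch: one synchronous data-parallel SGD step with AllReduce.
import torch
import torch.distributed as dist

def sgd_step(model, lr=0.01):
    world = dist.get_world_size()
    for p in model.parameters():
        if p.grad is None:
            continue
        # Collective: sum gradients from all workers, then average.
        dist.all_reduce(p.grad, op=dist.ReduceOp.SUM)
        p.grad /= world
        # Local SGD update with the synchronized gradient.
        with torch.no_grad():
            p -= lr * p.grad
```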
no code implementations • 13 Apr 2021 • Ning Liu, Songlei Jian, Dongsheng Li, Yiming Zhang, Zhiquan Lai, Hongzuo Xu
Graph neural networks (GNNs) have proven mature enough for handling graph-structured data in node-level graph representation learning tasks.
1 code implementation • 10 Jun 2020 • Yu Tang, Zhigang Kan, Dequan Sun, Jingjing Xiao, Zhiquan Lai, Linbo Qiao, Dongsheng Li
We also provide novel update rules and a theoretical convergence analysis.