no code implementations • 20 Oct 2020 • Shaohuai Shi, Xianhao Zhou, Shutao Song, Xingyao Wang, Zilin Zhu, Xue Huang, Xinan Jiang, Feihu Zhou, Zhenyu Guo, Liqiang Xie, Rui Lan, Xianbin Ouyang, Yan Zhang, Jieqian Wei, Jing Gong, Weiliang Lin, Ping Gao, Peng Meng, Xiaomin Xu, Chenyang Guo, Bo Yang, Zhibo Chen, Yongjian Wu, Xiaowen Chu
Distributed training techniques have been widely deployed for large-scale deep neural network (DNN) training on dense-GPU clusters.
no code implementations • 28 Jan 2020 • Xiaoli Liu, Pan Hu, Zhi Mao, Po-Chih Kuo, Peiyao Li, Chao Liu, Jie Hu, Deyu Li, Desen Cao, Roger G. Mark, Leo Anthony Celi, Zhengbo Zhang, Feihu Zhou
This study aims to develop an interpretable and generalizable model for early mortality prediction in elderly patients with multiple organ dysfunction syndrome (MODS).
no code implementations • 10 Sep 2019 • Haidong Rong, Yangzihao Wang, Feihu Zhou, Junjie Zhai, Haiyang Wu, Rui Lan, Fan Li, Han Zhang, Yuekui Yang, Zhenyu Guo, Di Wang
We present Distributed Equivalent Substitution (DES) training, a novel distributed training framework for large-scale recommender systems with dynamic sparse features.
no code implementations • 30 Jul 2018 • Xianyan Jia, Shutao Song, Wei He, Yangzihao Wang, Haidong Rong, Feihu Zhou, Liqiang Xie, Zhenyu Guo, Yuanzhou Yang, Liwei Yu, Tiegang Chen, Guangxiao Hu, Shaohuai Shi, Xiaowen Chu
(3) We propose highly optimized all-reduce algorithms that achieve up to 3x and 11x speedup on AlexNet and ResNet-50, respectively, over NCCL-based training on a cluster with 1024 Tesla P40 GPUs.