1 code implementation • 16 Nov 2023 • Hanpeng Hu, Junwei Su, Juntao Zhao, Yanghua Peng, Yibo Zhu, Haibin Lin, Chuan Wu
Considering the large space of DNN models and devices that impede direct profiling of all combinations, recent efforts focus on building a predictor to model the performance of DNN models on different devices.
1 code implementation • 6 Oct 2022 • Yujia Zhai, Chengquan Jiang, Leyuan Wang, Xiaoying Jia, Shang Zhang, Zizhong Chen, Xin Liu, Yibo Zhu
In this paper, we present ByteTransformer, a high-performance transformer boosted for variable-length inputs.
no code implementations • 28 May 2022 • Zhuang Wang, Haibin Lin, Yibo Zhu, T. S. Eugene Ng
It first designs a decision tree abstraction to express all the compression strategies and develops empirical models to timeline tensor computation, communication, and compression to enable ByteComp to derive the intricate interactions among tensors.
no code implementations • 5 May 2022 • Hanpeng Hu, Chenyu Jiang, Yuchen Zhong, Yanghua Peng, Chuan Wu, Yibo Zhu, Haibin Lin, Chuanxiong Guo
Distributed training using multiple devices (e.g., GPUs) has been widely adopted for learning DNN models over large datasets.
no code implementations • 16 Feb 2022 • Jiamin Li, Hong Xu, Yibo Zhu, Zherui Liu, Chuanxiong Guo, Cong Wang
We introduce Aryl, a new cluster scheduler to address these problems.
no code implementations • 16 Dec 2021 • Tianfeng Liu, Yangrui Chen, Dan Li, Chuan Wu, Yibo Zhu, Jun He, Yanghua Peng, Hongzheng Chen, Hongzhi Chen, Chuanxiong Guo
Extensive experiments on various GNN models and large graph datasets show that BGL significantly outperforms existing GNN training systems by 20.68x on average.
no code implementations • 25 Oct 2021 • Jiarong Xing, Leyuan Wang, Shang Zhang, Jack Chen, Ang Chen, Yibo Zhu
Today's auto-tuners (e.g., AutoTVM, Ansor) generate efficient tensor programs by navigating a large search space to identify effective implementations, but they do so with opaque hardware details.
no code implementations • 18 Sep 2021 • Cheng Tan, Zhichao Li, Jian Zhang, Yu Cao, Sikai Qi, Zherui Liu, Yibo Zhu, Chuanxiong Guo
With MIG, A100 can be the most cost-efficient GPU ever for serving Deep Neural Networks (DNNs).
1 code implementation • ICLR 2021 • Yuchen Jin, Tianyi Zhou, Liangyu Zhao, Yibo Zhu, Chuanxiong Guo, Marco Canini, Arvind Krishnamurthy
This mutual-training process between BO and the loss-prediction model allows us to limit the training steps invested in the BO search.
no code implementations • 2 Jul 2020 • Brian S. Lee, Bumho Kim, Alexandre P. Freitas, Aseema Mohanty, Yibo Zhu, Gaurang R. Bhatt, James Hone, Michal Lipson
High performance integrated electro-optic modulators operating at low temperature are critical for optical interconnects in cryogenic applications.
Applied Physics • Optics