1 code implementation • 13 Oct 2021 • Yujing Ma, Florin Rusu, Kesheng Wu, Alexander Sim
We address these challenges with Adaptive SGD, an adaptive elastic model averaging stochastic gradient descent algorithm for heterogeneous multi-GPUs that is characterized by dynamic scheduling, adaptive batch size scaling, and normalized model merging.
no code implementations • 1 Apr 2021 • Qi Zhao, Yujing Ma, Shuchang Lyu, Lijiang Chen
To address this issue, we embed a self-distillation (SD) method in it to transfer knowledge from the ensemble network to the main branch.
no code implementations • 29 Dec 2020 • Qi Zhao, Shuchang Lyu, Yuewen Li, Yujing Ma, Lijiang Chen
To avoid interference from confusing information, we propose the Multi-granularity Multi-Level Feature Ensemble Module (MGML-FEM), which provides diverse predictions through a full-channel feature generator (FC-FG).
1 code implementation • 19 Apr 2020 • Yujing Ma, Florin Rusu
To allow for a principled exploration of the design space, we first introduce a generic deep learning framework that exploits the difference in computational power and memory hierarchy between CPU and GPU through asynchronous message passing.
2 code implementations • 24 Feb 2018 • Yujing Ma, Florin Rusu, Martin Torres
The choice between synchronous GPU and asynchronous CPU depends on the task and the characteristics of the data.