no code implementations • 17 Jan 2024 • Daniele De Sensi, Tommaso Bonato, David Saam, Torsten Hoefler
The allreduce collective operation accounts for a significant fraction of the runtime of workloads running on distributed systems.
no code implementations • 3 Sep 2022 • Torsten Hoefler, Tommaso Bonato, Daniele De Sensi, Salvatore Di Girolamo, Shigang Li, Marco Heddes, Jon Belk, Deepak Goel, Miguel Castro, Steve Scott
Numerous microarchitectural optimizations unlocked tremendous processing power for deep neural networks, which in turn fueled the AI revolution.