3 code implementations • 3 Oct 2018 • Thorsten Kurth, Sean Treichler, Joshua Romero, Mayur Mudigonda, Nathan Luehr, Everett Phillips, Ankur Mahesh, Michael Matheson, Jack Deslippe, Massimiliano Fatica, Prabhat, Michael Houston
The Tiramisu network scales to 5300 P100 GPUs with a sustained throughput of 21.0 PF/s and parallel efficiency of 79.0%.
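To put these figures in per-device terms, the following is a minimal arithmetic sketch, not taken from the paper. It assumes that parallel efficiency is defined as sustained throughput divided by (number of GPUs × single-GPU throughput); the paper may define or measure it differently. The function names are hypothetical.

```python
# Back-of-the-envelope arithmetic on the reported scaling numbers.
# Assumption (not from the source): efficiency = sustained / (N * single-GPU rate).

def per_gpu_throughput(total_flops: float, num_gpus: int) -> float:
    """Average sustained throughput per GPU, in FLOP/s."""
    return total_flops / num_gpus


def implied_single_gpu_throughput(total_flops: float, num_gpus: int,
                                  efficiency: float) -> float:
    """Single-GPU baseline implied by the assumed efficiency definition."""
    return total_flops / (num_gpus * efficiency)


total = 21.0e15      # 21.0 PF/s sustained, as reported
gpus = 5300          # number of P100 GPUs, as reported
efficiency = 0.79    # 79.0% parallel efficiency, as reported

per_gpu = per_gpu_throughput(total, gpus)
baseline = implied_single_gpu_throughput(total, gpus, efficiency)

print(f"{per_gpu / 1e12:.2f} TF/s per GPU at scale")
print(f"{baseline / 1e12:.2f} TF/s implied single-GPU baseline")
```

Under that assumption, the average rate per GPU at full scale is the reported total divided by 5300, and the implied single-GPU baseline is that value divided by 0.79.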
Distributed, Parallel, and Cluster Computing