1 code implementation • 27 Mar 2019 • Yusuke Nagasaka, Akira Nukada, Ryosuke Kojima, Satoshi Matsuoka
We evaluated the performance of the GCN application on TSUBAME3.0, which is equipped with NVIDIA Tesla P100 GPUs, and our batched approach shows significant speedups of up to 1.59x and 1.37x in training and inference, respectively.
Distributed, Parallel, and Cluster Computing