1 code implementation • NeurIPS 2021 • Jeff Pool, Chong Yu
We introduce channel permutations as a method to maximize the accuracy of N:M sparse networks.
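The mechanism is easy to sketch: permuting a layer's input channels changes which weights land in each group of four for 2:4 pruning, so a good permutation preserves more of the weight magnitude that pruning would otherwise destroy. Below is a minimal, hedged illustration in NumPy; the random-swap search and both function names are my simplification, not the authors' actual search algorithm.

```python
import numpy as np

def kept_magnitude_2to4(w):
    """Magnitude retained when each group of 4 consecutive input
    channels keeps only its 2 largest-magnitude weights."""
    groups = np.abs(w).reshape(w.shape[0], -1, 4)   # (out, n_groups, 4)
    return np.sort(groups, axis=-1)[..., 2:].sum()  # top-2 per group

def random_swap_search(w, iters=1000, seed=0):
    """Toy stand-in for the paper's permutation search: propose random
    column swaps and keep any swap that does not reduce kept magnitude."""
    rng = np.random.default_rng(seed)
    perm = np.arange(w.shape[1])
    best = kept_magnitude_2to4(w[:, perm])
    for _ in range(iters):
        i, j = rng.choice(w.shape[1], size=2, replace=False)
        perm[i], perm[j] = perm[j], perm[i]
        score = kept_magnitude_2to4(w[:, perm])
        if score >= best:
            best = score
        else:
            perm[i], perm[j] = perm[j], perm[i]     # revert a bad swap
    return perm, best

w = np.random.randn(64, 64).astype(np.float32)
perm, score = random_swap_search(w)
print(score - kept_magnitude_2to4(w))  # magnitude recovered by permuting
```

Permuting a layer's input channels is functionally neutral as long as the producing layer's output channels are permuted to match; only the grouping seen by the 2:4 mask changes, which is why a good permutation can raise the accuracy of the pruned network.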
2 code implementations • 16 Apr 2021 • Asit Mishra, Jorge Albericio Latorre, Jeff Pool, Darko Stosic, Dusan Stosic, Ganesh Venkatesh, Chong Yu, Paulius Micikevicius
We present the design and behavior of Sparse Tensor Cores, which exploit a 2:4 (50%) sparsity pattern that leads to twice the math throughput of dense matrix units.
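For reference, the 2:4 pattern itself can be expressed in a few lines of PyTorch: within every group of four consecutive weights along the input dimension, keep the two of largest magnitude and zero the other two. This is a bare magnitude-based sketch, not NVIDIA's actual pruning workflow.

```python
import torch

def prune_2_to_4(w: torch.Tensor) -> torch.Tensor:
    """Apply a 2:4 mask: in each group of 4 weights along the input
    dimension, keep the 2 of largest magnitude and zero the rest."""
    out_ch, in_ch = w.shape                      # in_ch must be divisible by 4
    groups = w.abs().reshape(out_ch, in_ch // 4, 4)
    keep = groups.topk(2, dim=-1).indices        # indices of the 2 largest
    mask = torch.zeros_like(groups).scatter_(-1, keep, 1.0)
    return w * mask.reshape(out_ch, in_ch)

w = torch.randn(8, 16)
ws = prune_2_to_4(w)
assert (ws.reshape(8, -1, 4) != 0).sum(-1).max() <= 2  # ≤ 2 nonzeros per group
```

The fixed 50% pattern is what makes the hardware support cheap: only the nonzero values plus small per-group index metadata need to be stored and fed to the math units.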
1 code implementation • 3 Jul 2020 • Chong Yu, Jeff Pool
Deep learning's success has led to larger and larger models to handle more and more complex tasks; trained models can contain millions of parameters.
no code implementations • 6 Mar 2019 • Esha Choukse, Michael Sullivan, Mike O'Connor, Mattan Erez, Jeff Pool, David Nellans, Steve Keckler
However, GPU device memory tends to be relatively small, and its capacity cannot be increased by the user.
no code implementations • 1 Jun 2018 • Maohua Zhu, Jason Clemons, Jeff Pool, Minsoo Rhu, Stephen W. Keckler, Yuan Xie
Further, we can enforce structured sparsity in the gate gradients to make the LSTM backward pass up to 45% faster than the state-of-the-art dense approach and 168% faster than the state-of-the-art sparsifying method on modern GPUs.
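As a rough illustration of the idea (my simplification, not the paper's exact thresholding scheme), structured sparsity here means zeroing whole columns of the gate-gradient matrix rather than scattered elements, so the backward GEMMs can skip them wholesale:

```python
import torch

def sparsify_gate_grads(dgates: torch.Tensor, keep_frac: float = 0.5):
    """Hypothetical sketch: keep only the columns of the gate-gradient
    matrix with the largest L2 norms; zeroed columns can be skipped
    entirely by the backward GEMMs (structured, not scattered, zeros)."""
    norms = dgates.norm(dim=0)                     # one norm per column
    k = max(1, int(keep_frac * dgates.shape[1]))
    mask = torch.zeros(dgates.shape[1])
    mask[norms.topk(k).indices] = 1.0
    return dgates * mask                           # broadcast over rows

dg = torch.randn(32, 4 * 128)   # (batch, 4 * hidden) LSTM gate gradients
print((sparsify_gate_grads(dg) == 0).float().mean())  # ~50% structured zeros
```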
no code implementations • ICLR 2018 • Feiwen Zhu, Jeff Pool, Michael Andersch, Jeremy Appleyard, Fung Xie
Recurrent Neural Networks (RNNs) are powerful tools for solving sequence-based problems, but their efficacy and execution time are dependent on the size of the network.
1 code implementation • ICLR 2018 • Xingyu Liu, Jeff Pool, Song Han, William J. Dally
First, we move the ReLU operation into the Winograd domain to increase the sparsity of the transformed activations.
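To make that first step concrete, here is a small NumPy sketch using the standard F(2x2, 3x3) Winograd input transform (the constants are the usual published ones; the function name is mine). Applying ReLU after the transform means the zeros it creates line up with the element-wise multiplies that replace convolution, so they can actually be skipped:

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd input-transform matrix B^T.
BT = np.array([[1,  0, -1,  0],
               [0,  1,  1,  0],
               [0, -1,  1,  0],
               [0,  1,  0, -1]], dtype=np.float32)

def winograd_domain_relu(tile):
    """Transform a 4x4 input tile into the Winograd domain
    (V = B^T d B), then apply ReLU there instead of in the
    spatial domain."""
    v = BT @ tile @ BT.T
    return np.maximum(v, 0.0)

tile = np.random.randn(4, 4).astype(np.float32)
print((winograd_domain_relu(tile) == 0).mean())  # zeros seen by the multiply
```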
no code implementations • 24 May 2017 • Huizi Mao, Song Han, Jeff Pool, Wenshuo Li, Xingyu Liu, Yu Wang, William J. Dally
Since memory references are more than two orders of magnitude more expensive than arithmetic operations, regularity in the sparse structure leads to more efficient hardware designs.
no code implementations • 3 May 2017 • Minsoo Rhu, Mike O'Connor, Niladrish Chatterjee, Jeff Pool, Stephen W. Keckler
Popular deep learning frameworks require users to fine-tune their memory usage so that the training data of a deep neural network (DNN) fits within the GPU physical memory.
2 code implementations • 15 Jul 2016 • Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Peter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally
We propose DSD, a dense-sparse-dense training flow, for regularizing deep neural networks and achieving better optimization performance.
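The flow itself is simple enough to sketch. Below is a hedged PyTorch outline of the three phases, assuming a caller-supplied `train_step(model)` closure (hypothetical) that runs one optimization step; the pruning criterion is plain magnitude, as in the paper:

```python
import torch

def dsd(model, train_step, steps_per_phase, sparsity=0.5):
    """Sketch of dense-sparse-dense training: train dense, prune the
    smallest weights and retrain under the mask, then drop the mask
    and retrain dense so the pruned weights can recover."""
    for _ in range(steps_per_phase):              # phase 1: dense
        train_step(model)

    masks = {}
    for name, p in model.named_parameters():      # prune small magnitudes
        if p.dim() > 1:
            k = max(1, int(sparsity * p.numel()))
            thresh = p.abs().flatten().kthvalue(k).values
            masks[name] = (p.abs() > thresh).float()
            p.data *= masks[name]

    for _ in range(steps_per_phase):              # phase 2: sparse
        train_step(model)
        with torch.no_grad():
            for name, p in model.named_parameters():
                if name in masks:
                    p.mul_(masks[name])           # re-apply mask after update

    for _ in range(steps_per_phase):              # phase 3: dense again
        train_step(model)
```

The final dense phase is the distinctive part: reviving the pruned weights from zero gives the optimizer extra capacity after the sparse phase has settled into a better region of the loss surface.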
7 code implementations • NeurIPS 2015 • Song Han, Jeff Pool, John Tran, William J. Dally
On the ImageNet dataset, our method reduced the number of parameters of AlexNet by a factor of 9x, from 61 million to 6.7 million, without incurring accuracy loss.