Search Results for author: Patrick Judd

Found 9 papers, 2 papers with code

FP8 Formats for Deep Learning

2 code implementations • 12 Sep 2022 • Paulius Micikevicius, Dusan Stosic, Neil Burgess, Marius Cornea, Pradeep Dubey, Richard Grisenthwaite, Sangwon Ha, Alexander Heinecke, Patrick Judd, John Kamalu, Naveen Mellempudi, Stuart Oberman, Mohammad Shoeybi, Michael Siu, Hao Wu

FP8 is a natural progression for accelerating deep learning training and inference beyond the 16-bit formats common in modern processors.

Quantization

Integer Quantization for Deep Learning Inference: Principles and Empirical Evaluation

2 code implementations • 20 Apr 2020 • Hao Wu, Patrick Judd, Xiaojie Zhang, Mikhail Isaev, Paulius Micikevicius

Quantization techniques can reduce the size of Deep Neural Networks and improve inference latency and throughput by taking advantage of high-throughput integer instructions.

Math Quantization

DPRed: Making Typical Activation and Weight Values Matter In Deep Learning Computing

no code implementations • 17 Apr 2018 • Alberto Delmas, Sayeh Sharify, Patrick Judd, Kevin Siu, Milos Nikolic, Andreas Moshovos

The per-group precisions are selected statically for the weights and dynamically by hardware for the activations.

Bit-Tactical: Exploiting Ineffectual Computations in Convolutional Neural Networks: Which, Why, and How

no code implementations • 9 Mar 2018 • Alberto Delmas, Patrick Judd, Dylan Malone Stuart, Zissis Poulos, Mostafa Mahmoud, Sayeh Sharify, Milos Nikolic, Andreas Moshovos

We show that, during inference with Convolutional Neural Networks (CNNs), more than 2x to 8x ineffectual work can be exposed if, instead of targeting those weights and activations that are zero, we target different combinations of value stream properties.

Tartan: Accelerating Fully-Connected and Convolutional Layers in Deep Learning Networks by Exploiting Numerical Precision Variability

no code implementations • 27 Jul 2017 • Alberto Delmas, Sayeh Sharify, Patrick Judd, Andreas Moshovos

Experiments on image classification CNNs show that, on average across all networks studied, TRT outperforms a state-of-the-art bit-parallel accelerator by 1.90x without any loss in accuracy while being 1.17x more energy efficient.

Image Classification

Loom: Exploiting Weight and Activation Precisions to Accelerate Convolutional Neural Networks

no code implementations • 23 Jun 2017 • Sayeh Sharify, Alberto Delmas Lascorz, Kevin Siu, Patrick Judd, Andreas Moshovos

LM can trade off accuracy for additional improvements in execution performance and energy efficiency, and it compares favorably to an accelerator that targeted only activation precisions.

Image Classification

Dynamic Stripes: Exploiting the Dynamic Precision Requirements of Activation Values in Neural Networks

no code implementations • 1 Jun 2017 • Alberto Delmas, Patrick Judd, Sayeh Sharify, Andreas Moshovos

Stripes is a Deep Neural Network (DNN) accelerator that uses bit-serial computation to offer performance that is proportional to the fixed-point precision of the activation values.

Cnvlutin2: Ineffectual-Activation-and-Weight-Free Deep Neural Network Computing

no code implementations • 29 Apr 2017 • Patrick Judd, Alberto Delmas, Sayeh Sharify, Andreas Moshovos

We also present a modified organization that detects ineffectual activations while fetching them from memory.

Reduced-Precision Strategies for Bounded Memory in Deep Neural Nets

no code implementations • 17 Nov 2015 • Patrick Judd, Jorge Albericio, Tayler Hetherington, Tor Aamodt, Natalie Enright Jerger, Raquel Urtasun, Andreas Moshovos

A diverse set of CNNs is analyzed, showing that, compared to a conventional implementation using a 32-bit floating-point representation for all layers, and with less than 1% loss in relative accuracy, the data footprint required by these networks can be reduced by an average of 74% and by up to 92%.
