1 code implementation • 6 Oct 2023 • Naren Dhyani, Jianqiao Mo, Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde
The Vision Transformer (ViT) architecture has emerged as the backbone of choice for state-of-the-art deep models for computer vision applications.
no code implementations • 9 Jul 2023 • Jianqiao Mo, Karthik Garimella, Negar Neda, Austin Ebel, Brandon Reagen
The characterization motivates the need for both GCs and HE accelerators.
1 code implementation • 4 Feb 2022 • Minsu Cho, Ameya Joshi, Siddharth Garg, Brandon Reagen, Chinmay Hegde
To reduce PI latency we propose a gradient-based algorithm that selectively linearizes ReLUs while maintaining prediction accuracy.
2 code implementations • 26 Jul 2021 • Karthik Garimella, Nandan Kumar Jha, Brandon Reagen
In this work, we ask: Is it feasible to substitute all ReLUs with low-degree polynomial activation functions for building deep, privacy-friendly neural networks?
no code implementations • 17 Jun 2021 • Minsu Cho, Zahra Ghodsi, Brandon Reagen, Siddharth Garg, Chinmay Hegde
The emergence of deep learning has been accompanied by privacy concerns surrounding users' data and service providers' models.
no code implementations • NeurIPS 2021 • Zahra Ghodsi, Nandan Kumar Jha, Brandon Reagen, Siddharth Garg
In this paper we re-think the ReLU computation and propose optimizations for PI tailored to properties of neural networks.
no code implementations • 9 May 2021 • Deeksha Dangwal, Vincent T. Lee, Hyo Jin Kim, Tianwei Shen, Meghan Cowan, Rajvi Shah, Caroline Trippel, Brandon Reagen, Timothy Sherwood, Vasileios Balntas, Armin Alaghi, Eddy Ilg
This poses a potential risk to user privacy.
no code implementations • 2 Mar 2021 • Nandan Kumar Jha, Zahra Ghodsi, Siddharth Garg, Brandon Reagen
This paper proposes DeepReDuce: a set of optimizations for the judicious removal of ReLUs to reduce private inference latency.
no code implementations • NeurIPS 2020 • Zahra Ghodsi, Akshaj Veldanda, Brandon Reagen, Siddharth Garg
Machine learning as a service has given rise to privacy concerns surrounding clients' data and providers' models and has catalyzed research in private inference (PI): methods to process inferences without disclosing inputs.
no code implementations • 8 Jan 2020 • Udit Gupta, Samuel Hsia, Vikram Saraph, Xiaodong Wang, Brandon Reagen, Gu-Yeon Wei, Hsien-Hsin S. Lee, David Brooks, Carole-Jean Wu
Neural personalized recommendation is the cornerstone of a wide collection of cloud services and products, constituting significant compute demand of the cloud infrastructure.
Distributed, Parallel, and Cluster Computing
no code implementations • 23 Aug 2019 • Udit Gupta, Brandon Reagen, Lillian Pentecost, Marco Donato, Thierry Tambe, Alexander M. Rush, Gu-Yeon Wei, David Brooks
The architecture is enhanced by a series of dynamic activation optimizations that enable compact storage, ensure no energy is wasted computing null operations, and maintain high MAC utilization for highly parallel accelerator designs.
7 code implementations • 6 Jun 2019 • Udit Gupta, Carole-Jean Wu, Xiaodong Wang, Maxim Naumov, Brandon Reagen, David Brooks, Bradford Cottel, Kim Hazelwood, Bill Jia, Hsien-Hsin S. Lee, Andrey Malevich, Dheevatsa Mudigere, Mikhail Smelyanskiy, Liang Xiong, Xuan Zhang
The widespread application of deep learning has changed the landscape of computation in the data center.
2 code implementations • 13 Nov 2017 • Brandon Reagen, Udit Gupta, Robert Adolf, Michael M. Mitzenmacher, Alexander M. Rush, Gu-Yeon Wei, David Brooks
This results in up to a 1.51x improvement over the state-of-the-art.
1 code implementation • 23 Aug 2016 • Robert Adolf, Saketh Rama, Brandon Reagen, Gu-Yeon Wei, David Brooks
Fathom has been released online, and this paper focuses on understanding the fundamental performance characteristics of each model.