1 code implementation • 25 Apr 2023 • Evangelos Georganas, Dhiraj Kalamkar, Kirill Voronin, Abhisek Kundu, Antonio Noack, Hans Pabst, Alexander Breuer, Alexander Heinecke
During the past decade, Deep Learning (DL) algorithms, programming systems, and hardware have converged with their High Performance Computing (HPC) counterparts.
no code implementations • 14 Apr 2023 • Abhisek Kundu, Naveen K. Mellempudi, Dharma Teja Vooturi, Bharat Kaul, Pradeep Dubey
We integrated GA with the latest learnable pruning methods to create an automated sparse-training algorithm, AutoSparse, which achieves better accuracy and/or greater training/inference FLOPS reduction than existing learnable pruning methods for sparse ResNet50 and MobileNetV1 on ImageNet-1K. For example, AutoSparse achieves a (2x, 7x) reduction in (training, inference) FLOPS for ResNet50 on ImageNet at 80% sparsity.
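A minimal sketch of the Gradient Annealing idea on a toy least-squares problem, assuming a fixed magnitude threshold and a linear decay schedule (the paper's thresholds are learnable and its exact annealing function differs; all names below are illustrative):

```python
import numpy as np

# Gradient Annealing (GA) sketch: prune by magnitude against a threshold,
# but let gradients keep flowing to pruned weights at a strength `alpha`
# that decays from 1 to 0, so pruned weights can revive early in training.
rng = np.random.default_rng(0)
n, d = 256, 64
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d) * (rng.random(d) < 0.2)   # sparse ground truth
y = X @ w_true

w = 0.1 * rng.normal(size=d)
thresh = 0.05            # fixed here; learnable per layer in AutoSparse
lr, steps = 1e-2, 500
for t in range(steps):
    alpha = 1.0 - t / steps                  # linear annealing (illustrative)
    mask = (np.abs(w) > thresh).astype(float)
    grad = X.T @ (X @ (w * mask) - y) / n    # gradient of the masked model
    w -= lr * (mask + (1 - mask) * alpha) * grad
print("weight sparsity:", 1.0 - np.mean(np.abs(w) > thresh))
```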
3 code implementations • 12 Apr 2021 • Evangelos Georganas, Dhiraj Kalamkar, Sasikanth Avancha, Menachem Adelman, Deepti Aggarwal, Cristina Anderson, Alexander Breuer, Jeremy Bruestle, Narendra Chaudhary, Abhisek Kundu, Denise Kutnick, Frank Laub, Vasimuddin Md, Sanchit Misra, Ramanarayan Mohanty, Hans Pabst, Brian Retford, Barukh Ziv, Alexander Heinecke
The TPP specification is platform-agnostic, so code expressed via TPPs is portable, whereas the TPP implementation is highly optimized and platform-specific.
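A toy illustration of that separation of concerns (the primitive names and the NumPy "backend" below are illustrative stand-ins; the real TPPs are JITed, platform-specific kernels exposed through LIBXSMM):

```python
import numpy as np

# The operator is written purely against a small set of 2D-tensor
# primitives; only the primitive implementations are platform-specific.
# Here the "backend" is plain NumPy, but it could be swapped for
# vectorized, architecture-tuned kernels without touching mlp_layer.
TPP_BACKEND = {
    "gemm": lambda A, B: A @ B,               # matrix-multiply primitive
    "add":  lambda A, B: A + B,               # element-wise binary primitive
    "relu": lambda A: np.maximum(A, 0.0),     # element-wise unary primitive
}

def mlp_layer(x, w, b):
    """A fused MLP layer expressed only in terms of primitives."""
    g = TPP_BACKEND
    return g["relu"](g["add"](g["gemm"](x, w), b))

x = np.random.rand(8, 16); w = np.random.rand(16, 32); b = np.zeros(32)
print(mlp_layer(x, w, b).shape)  # (8, 32)
```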
no code implementations • 17 Sep 2019 • Abhisek Kundu, Alex Heinecke, Dhiraj Kalamkar, Sudarshan Srinivasan, Eric C. Qin, Naveen K. Mellempudi, Dipankar Das, Kunal Banerjee, Bharat Kaul, Pradeep Dubey
We propose K-TanH, a novel, highly accurate, hardware-efficient approximation of the popular TanH activation function for Deep Learning.
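The actual K-TanH algorithm operates on the raw exponent/mantissa bits with small integer lookup tables; the sketch below only conveys the table-driven flavor via a piecewise-linear approximation (the bucket count and range are illustrative assumptions):

```python
import numpy as np

# Table-driven tanh approximation: piecewise-linear segments y = s*x + o
# selected by bucketing |x|, exploiting tanh's odd symmetry and saturation.
EDGES = np.linspace(0.0, 4.0, 17)          # 16 buckets of width 0.25 on [0, 4)
CENTERS = 0.5 * (EDGES[:-1] + EDGES[1:])
SLOPE = 1.0 / np.cosh(CENTERS) ** 2        # tanh'(c) per bucket
OFFSET = np.tanh(CENTERS) - SLOPE * CENTERS

def k_tanh_like(x):
    a = np.abs(x)
    idx = np.minimum((a / 0.25).astype(int), 15)   # table index per element
    y = SLOPE[idx] * a + OFFSET[idx]
    y[a >= 4.0] = 1.0                              # saturation region
    return np.sign(x) * y

x = np.linspace(-6, 6, 1001)
print("max abs error:", np.max(np.abs(k_tanh_like(x) - np.tanh(x))))
```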
no code implementations • 29 May 2019 • Dhiraj Kalamkar, Dheevatsa Mudigere, Naveen Mellempudi, Dipankar Das, Kunal Banerjee, Sasikanth Avancha, Dharma Teja Vooturi, Nataraj Jammalamadaka, Jianyu Huang, Hector Yuen, Jiyan Yang, Jongsoo Park, Alexander Heinecke, Evangelos Georganas, Sudarshan Srinivasan, Abhisek Kundu, Misha Smelyanskiy, Bharat Kaul, Pradeep Dubey
In this paper, we discuss the flow of tensors and various key operations in mixed-precision training, and delve into the details of those operations, such as the rounding modes used when converting FP32 tensors to BFLOAT16.
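A self-contained sketch of two such conversion modes (truncation vs. round-to-nearest-even on the dropped 16 mantissa bits); the bit manipulation follows the standard FP32/BFLOAT16 layouts rather than any code from the paper:

```python
import numpy as np

# FP32 -> BFLOAT16 keeps the sign and 8-bit exponent and shortens the
# mantissa from 23 to 7 bits, i.e. the low 16 bits of the word are dropped.
def fp32_to_bf16_trunc(x):
    bits = x.astype(np.float32).view(np.uint32)
    return (bits & 0xFFFF0000).view(np.float32)   # truncate toward zero

def fp32_to_bf16_rne(x):
    bits = x.astype(np.float32).view(np.uint32)
    lsb = (bits >> 16) & 1                        # ties round to even
    return ((bits + 0x7FFF + lsb) & 0xFFFF0000).view(np.float32)

x = np.float32([1.0048828125])   # 1 + 2^-8 + 2^-10, not bf16-representable
print(fp32_to_bf16_trunc(x))     # 1.0 (rounds down)
print(fp32_to_bf16_rne(x))       # 1.0078125 (nearest bf16 value)
```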
no code implementations • 15 Jul 2017 • Abhisek Kundu, Kunal Banerjee, Naveen Mellempudi, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, Pradeep Dubey
Aided by such an elegant trade-off between accuracy and compute, the 8-2 model (8-bit activations, ternary weights), enhanced by ternary residual edges, turns out to be sophisticated enough to achieve very high accuracy ($\sim 1\%$ drop from our FP-32 baseline), despite a $\sim 1.6\times$ reduction in model size, a $\sim 26\times$ reduction in the number of multiplications, and a potential $\sim 2\times$ power-performance gain compared to the 8-8 representation, on the state-of-the-art ResNet-101 deep network pre-trained on the ImageNet dataset.
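An illustrative sketch of the residual idea (not the paper's exact scheme): quantize the weights once to a ternary term, then quantize the leftover residual with a second ternary term, so $W \approx a_1 T_1 + a_2 T_2$:

```python
import numpy as np

# Two-term ternary approximation: the second ternary term is fit to the
# residual left over by the first, shrinking the approximation error.
def ternarize(w, frac=0.7):
    """Threshold ternarization: T in {-1, 0, +1} with a single scale a."""
    t = frac * np.mean(np.abs(w))
    T = np.sign(w) * (np.abs(w) > t)
    a = np.abs(w[T != 0]).mean() if np.any(T) else 0.0
    return a, T

rng = np.random.default_rng(0)
W = rng.normal(size=(64, 64))

a1, T1 = ternarize(W)
a2, T2 = ternarize(W - a1 * T1)          # ternarize the residual edge
err1 = np.linalg.norm(W - a1 * T1) / np.linalg.norm(W)
err2 = np.linalg.norm(W - a1 * T1 - a2 * T2) / np.linalg.norm(W)
print(f"1-term error {err1:.3f} -> 2-term error {err2:.3f}")
```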
no code implementations • 2 May 2017 • Naveen Mellempudi, Abhisek Kundu, Dheevatsa Mudigere, Dipankar Das, Bharat Kaul, Pradeep Dubey
We address this by fine-tuning ResNet-50 with 8-bit activations and ternary weights at $N=64$, improving the Top-1 accuracy to within $4\%$ of the full-precision result with $<30\%$ additional training overhead.
no code implementations • 31 Jan 2017 • Naveen Mellempudi, Abhisek Kundu, Dipankar Das, Dheevatsa Mudigere, Bharat Kaul
We propose a cluster-based quantization method to convert pre-trained full precision weights into ternary weights with minimal impact on the accuracy.
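A hedged sketch of the fine-grained idea, assuming contiguous groups of $N$ weights as the clusters and a simple threshold/scale rule per group (the paper's clustering and threshold choices may differ):

```python
import numpy as np

# Fine-grained ternarization: split the weights into groups of N values and
# ternarize each group with its own threshold and scale, so the quantizer
# tracks local weight statistics instead of one global layer-wide scale.
def ternarize_group(w, frac=0.7):
    t = frac * np.mean(np.abs(w))
    T = np.sign(w) * (np.abs(w) > t)
    a = np.abs(w[T != 0]).mean() if np.any(T) else 0.0
    return a * T

def ternarize_clustered(w, N):
    flat = w.reshape(-1)
    out = np.concatenate([ternarize_group(flat[i:i + N])
                          for i in range(0, flat.size, N)])
    return out.reshape(w.shape)

rng = np.random.default_rng(1)
# rows with very different magnitudes: hard for a single layer-wide scale
W = np.linspace(0.1, 2.0, 128)[:, None] * rng.normal(size=(128, 128))
for N in (W.size, 64):      # whole-layer vs. per-group quantization
    err = np.linalg.norm(W - ternarize_clustered(W, N)) / np.linalg.norm(W)
    print(f"N={N}: relative error {err:.3f}")
```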
no code implementations • 13 Aug 2015 • Kimon Fountoulakis, Abhisek Kundu, Eugenia-Maria Kontopoulou, Petros Drineas
We present and analyze a simple, two-step algorithm to approximate the optimal solution of the sparse PCA problem.
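A plausible reading of such a two-step scheme (details may differ from the paper's algorithm): first compute the top right singular vector, then keep its $k$ largest-magnitude entries and renormalize:

```python
import numpy as np

# Two-step sparse PCA heuristic: (1) top principal component via SVD,
# (2) hard-threshold to the k largest entries and renormalize to unit norm.
def two_step_sparse_pca(A, k):
    _, _, Vt = np.linalg.svd(A, full_matrices=False)   # step 1: top PC
    v = Vt[0].copy()
    v[np.argsort(np.abs(v))[:-k]] = 0.0                # step 2: sparsify
    return v / np.linalg.norm(v)

rng = np.random.default_rng(2)
A = rng.normal(size=(200, 50))
v = two_step_sparse_pca(A, k=10)
print("nnz:", np.count_nonzero(v),
      "explained:", np.linalg.norm(A @ v) / np.linalg.svd(A, compute_uv=False)[0])
```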
no code implementations • 22 Mar 2015 • Abhisek Kundu
In such settings, too, we can achieve an improvement in sample size.
no code implementations • NeurIPS 2015 • Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail
We show that, for a wide class of optimization problems, if the sketch is close (in the spectral norm) to the original data matrix, then one can recover a near-optimal solution to the optimization problem by using the sketch.
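An empirical illustration of the claim, using plain uniform element sampling with rescaling as the sketch (the paper's sampling distributions are more refined) and the top-singular-vector problem as the optimization:

```python
import numpy as np

# Plant a strong rank-1 signal, build an unbiased sparse sketch of A, solve
# max_{||x||=1} ||A x||_2 on the sketch, and evaluate that solution on A.
rng = np.random.default_rng(3)
u = rng.normal(size=300); u /= np.linalg.norm(u)
v = rng.normal(size=100); v /= np.linalg.norm(v)
A = 100.0 * np.outer(u, v) + rng.normal(size=(300, 100))

p = 0.5                                   # keep each entry with probability p
mask = rng.random(A.shape) < p
A_sk = np.where(mask, A / p, 0.0)         # unbiased, spectrally close sketch

x_opt = np.linalg.svd(A)[2][0]            # optimum computed on A itself
x_sk = np.linalg.svd(A_sk)[2][0]          # optimum recovered from the sketch
print("sketch spectral error:",
      np.linalg.norm(A - A_sk, 2) / np.linalg.norm(A, 2))
print("objective gap:",
      1.0 - np.linalg.norm(A @ x_sk) / np.linalg.norm(A @ x_opt))
```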
no code implementations • 2 Mar 2015 • Abhisek Kundu, Petros Drineas, Malik Magdon-Ismail
This paper addresses how well we can recover a data matrix when only given a few of its elements.
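A sketch of one such element-wise sampling scheme, assuming a hybrid of $\ell_1$ and $\ell_2$ importance for the sampling probabilities and a standard unbiased rescaling (the exact distribution in the paper may differ):

```python
import numpy as np

# Draw s entries with probabilities mixing l1 and l2 importance, rescale
# each draw by 1/(s * p_ij) so the sparse estimate is unbiased, and
# measure how close the estimate is to A in spectral norm.
rng = np.random.default_rng(4)
A = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 200))   # low-rank data

alpha, s = 0.5, 20000
P = alpha * np.abs(A) / np.abs(A).sum() + (1 - alpha) * A**2 / (A**2).sum()
p = P.reshape(-1)
draws = rng.choice(p.size, size=s, p=p, replace=True)

A_tilde = np.zeros(A.size)
np.add.at(A_tilde, draws, A.reshape(-1)[draws] / (s * p[draws]))
A_tilde = A_tilde.reshape(A.shape)
print("relative spectral error:",
      np.linalg.norm(A - A_tilde, 2) / np.linalg.norm(A, 2))
```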
no code implementations • 14 Oct 2013 • Abhisek Kundu, Srinivas Nambirajan, Petros Drineas
For any matrix $A \in \mathbb{R}^{m \times n}$ of rank $\rho$, we present a probability distribution over the entries of $A$ (the element-wise leverage scores of equation (2)) that reveals the most influential entries in the matrix.
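A hedged construction in that spirit, combining ordinary row and column leverage scores into a single distribution over entries; this is a plausible form for illustration, not necessarily the paper's equation (2):

```python
import numpy as np

# Row/column leverage scores of a rank-rho matrix, combined into one
# distribution over entries; entries whose row *and* column are both
# high-leverage receive the highest probability.
rng = np.random.default_rng(5)
A = rng.normal(size=(50, 8)) @ rng.normal(size=(8, 40))   # rank rho = 8

U, S, Vt = np.linalg.svd(A, full_matrices=False)
rho = int(np.sum(S > 1e-10 * S[0]))
row = np.sum(U[:, :rho] ** 2, axis=1)     # row leverage scores (sum to rho)
col = np.sum(Vt[:rho, :] ** 2, axis=0)    # column leverage scores (sum to rho)

P = np.outer(row, col)
P /= P.sum()                              # probability distribution over entries
print("most influential entry:", np.unravel_index(np.argmax(P), P.shape))
```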