Search Results for author: Sheng-Chun Kao

Found 13 papers, 6 papers with code

NonGEMM Bench: Understanding the Performance Horizon of the Latest ML Workloads with NonGEMM Workloads

no code implementations • 17 Apr 2024 • Rachid Karami, Hemanth Kota, Sheng-Chun Kao, Hyoukjun Kwon

Therefore, significant effort has been put into studying and optimizing GEMM operators in order to speed up the execution of ML models.
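
For context, a GEMM operator is a dense matrix-matrix multiply, while operators such as softmax or normalization fall outside that category. Below is a minimal sketch (my illustration, not code from the paper) contrasting the two with NumPy; the sizes and the softmax helper are arbitrary.

```python
# Minimal sketch (not from the paper): contrasting a GEMM operator with a
# common non-GEMM operator using NumPy. Timings are illustrative only.
import time
import numpy as np

x = np.random.rand(1024, 1024).astype(np.float32)
w = np.random.rand(1024, 1024).astype(np.float32)

def softmax(a):
    # Non-GEMM: elementwise exp plus a row-wise reduction.
    e = np.exp(a - a.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

t0 = time.perf_counter()
y = x @ w                      # GEMM: dense matrix-matrix multiply
t1 = time.perf_counter()
z = softmax(x)                 # Non-GEMM: normalization operator
t2 = time.perf_counter()
print(f"GEMM:    {t1 - t0:.4f}s")
print(f"softmax: {t2 - t1:.4f}s")
```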

Progressive Gradient Flow for Robust N:M Sparsity Training in Transformers

1 code implementation • 7 Feb 2024 • Abhimanyu Rajeshkumar Bambhaniya, Amir Yazdanbakhsh, Suvinay Subramanian, Sheng-Chun Kao, Shivani Agrawal, Utku Evci, Tushar Krishna

In this work, we study the effectiveness of existing sparse training recipes in high-sparsity regions and argue that these methods fail to sustain model quality on par with low-sparsity regions.
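
For readers unfamiliar with the setting: N:M structured sparsity keeps at most N nonzero weights in every contiguous group of M. A minimal sketch of building such a mask, assuming magnitude-based selection (the function name and shapes below are mine, not the paper's):

```python
# Minimal sketch (my illustration, not the paper's code): building a 2:4
# structured-sparsity mask that keeps the 2 largest-magnitude weights in
# every contiguous group of 4.
import numpy as np

def nm_mask(weights, n=2, m=4):
    flat = weights.reshape(-1, m)                    # groups of M weights
    keep = np.argsort(-np.abs(flat), axis=1)[:, :n]  # indices of N largest
    mask = np.zeros_like(flat)
    np.put_along_axis(mask, keep, 1.0, axis=1)
    return mask.reshape(weights.shape)

w = np.random.randn(4, 8).astype(np.float32)
mask = nm_mask(w)
print(mask)            # exactly 2 ones per group of 4
print(w * mask)        # sparse weights used in the forward pass
```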

Demystifying Map Space Exploration for NPUs

1 code implementation • 7 Oct 2022 • Sheng-Chun Kao, Angshuman Parashar, Po-An Tsai, Tushar Krishna

Map Space Exploration is the problem of finding optimized mappings of a Deep Neural Network (DNN) model on an accelerator.

Navigate • Neural Architecture Search
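
As a toy illustration of what map space exploration involves, the sketch below enumerates tile-size mappings of a GEMM under an assumed on-chip buffer capacity and ranks them with a naive data-movement proxy; the cost model and all constants are my assumptions, not the tools evaluated in the paper.

```python
# Toy sketch (assumptions mine, not the paper's cost model): exhaustively
# enumerating tile-size mappings of a GEMM onto a small on-chip buffer and
# ranking them with a rough "words moved off-chip" cost proxy.
from itertools import product

M, N, K = 256, 256, 256          # GEMM dimensions
BUFFER = 16 * 1024               # on-chip buffer capacity (words), assumed

def cost(tm, tn, tk):
    # Words moved per tile iteration times number of tiles (very rough).
    tiles = (M // tm) * (N // tn) * (K // tk)
    return tiles * (tm * tk + tk * tn + tm * tn)

candidates = []
for tm, tn, tk in product([16, 32, 64], repeat=3):
    if tm * tk + tk * tn + tm * tn <= BUFFER:    # mapping must fit on chip
        candidates.append(((tm, tn, tk), cost(tm, tn, tk)))

best = min(candidates, key=lambda c: c[1])
print("best tiling:", best)
```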

Training Recipe for N:M Structured Sparsity with Decaying Pruning Mask

no code implementations • 15 Sep 2022 • Sheng-Chun Kao, Amir Yazdanbakhsh, Suvinay Subramanian, Shivani Agrawal, Utku Evci, Tushar Krishna

In this work, we focus on N:M sparsity and extensively study and evaluate various training recipes for it in terms of the trade-off between model accuracy and compute cost (FLOPs).
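
A rough sketch of the decaying-mask idea in the title, assuming a linear decay schedule (the schedule, shapes, and toy mask below are my illustration, not the paper's recipe): instead of hard-zeroing pruned weights, they are attenuated by a factor that shrinks toward zero over training.

```python
# Illustrative sketch of a decaying pruning mask (schedule and toy mask are
# my assumptions): pruned weights are scaled by a factor that decays toward
# zero over training instead of being zeroed immediately.
import numpy as np

def apply_decaying_mask(weights, mask, step, total_steps):
    decay = max(0.0, 1.0 - step / total_steps)   # 1 -> 0 over training
    return weights * (mask + (1.0 - mask) * decay)

w = np.random.randn(2, 8).astype(np.float32)
mask = (np.abs(w) > np.median(np.abs(w))).astype(np.float32)  # toy mask

for step in [0, 500, 1000]:
    print(step, apply_decaying_mask(w, mask, step, total_steps=1000)[0, :4])
```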

DiGamma: Domain-aware Genetic Algorithm for HW-Mapping Co-optimization for DNN Accelerators

2 code implementations • 26 Jan 2022 • Sheng-Chun Kao, Michael Pellauer, Angshuman Parashar, Tushar Krishna

The design of DNN accelerators includes two key parts: HW resource configuration and mapping strategy.
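
A minimal genetic-algorithm sketch over a joint (HW configuration, mapping) encoding; the two-gene genome, toy latency model, and all constants below are placeholders of mine, not DiGamma's encoding or cost model.

```python
# Minimal genetic-algorithm sketch (encoding and fitness are placeholders,
# not DiGamma's): co-evolving a HW resource configuration (PE count) and a
# mapping parameter (tile size) against a toy latency model.
import random

def fitness(genome):
    pes, tile = genome
    compute = 1e6 / (pes * tile)                 # toy: more resources help
    spill = 0.0 if pes * tile <= 4096 else 1e3   # toy resource penalty
    return compute + spill

def mutate(genome):
    pes, tile = genome
    if random.random() < 0.5:
        pes = max(1, pes + random.choice([-8, 8]))
    else:
        tile = max(1, tile + random.choice([-4, 4]))
    return (pes, tile)

pop = [(random.randint(8, 128), random.randint(4, 64)) for _ in range(32)]
for _ in range(50):
    pop.sort(key=fitness)
    parents = pop[:8]                            # elitist selection
    pop = parents + [mutate(random.choice(parents)) for _ in range(24)]
print("best (PEs, tile):", min(pop, key=fitness))
```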

MAGMA: An Optimization Framework for Mapping Multiple DNNs on Multiple Accelerator Cores

no code implementations • 28 Apr 2021 • Sheng-Chun Kao, Tushar Krishna

In particular, we focus on the problem of mapping jobs from several DNNs simultaneously on an accelerator.

Efficient Exploration
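
To make the problem concrete, here is a toy greedy scheduler that places jobs from several DNNs on whichever core frees up first; MAGMA searches this space with optimization methods, so the heuristic and the job latencies below are purely my illustration.

```python
# Toy sketch of the multi-DNN mapping problem (the greedy heuristic and job
# latencies are my illustration, not MAGMA's method): each job is placed on
# the accelerator core that currently finishes earliest.
import heapq

jobs = [("resnet", 8), ("bert", 12), ("resnet", 6), ("bert", 9), ("gpt", 15)]
cores = [(0.0, i) for i in range(3)]    # (finish_time, core_id)
heapq.heapify(cores)

for name, latency in sorted(jobs, key=lambda j: -j[1]):  # longest first
    finish, core = heapq.heappop(cores)
    print(f"{name:7s} -> core {core} at t={finish:.1f}")
    heapq.heappush(cores, (finish + latency, core))
```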

Conditional Neural Architecture Search

no code implementations • 6 Jun 2020 • Sheng-Chun Kao, Arun Ramamurthy, Reed Williams, Tushar Krishna

Designing resource-efficient Deep Neural Networks (DNNs) is critical to deploy deep learning solutions over edge platforms due to diverse performance, power, and memory budgets.

Neural Architecture Search
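
As a sketch of the resource-budget side of such a search, the snippet below screens candidate architectures against assumed per-platform FLOPs budgets before any accuracy evaluation; the budgets and the FLOPs estimate are my assumptions, not the paper's search method.

```python
# Hypothetical sketch (budgets and the FLOPs model are mine): screening
# candidate architectures against per-platform resource budgets as a
# resource-aware NAS front end.
from itertools import product

BUDGETS = {"edge": 50e6, "mobile": 300e6}   # FLOPs budgets, assumed

def conv_flops(channels, depth, resolution=32, kernel=3):
    # Rough per-network FLOPs estimate for a stack of 3x3 convolutions.
    return depth * (resolution ** 2) * (kernel ** 2) * channels * channels

for platform, budget in BUDGETS.items():
    ok = [(c, d) for c, d in product([16, 32, 64], [4, 8, 16])
          if conv_flops(c, d) <= budget]
    print(platform, "feasible (channels, depth):", ok)
```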

Generative Design of Hardware-aware DNNs

no code implementations • 6 Jun 2020 • Sheng-Chun Kao, Arun Ramamurthy, Tushar Krishna

We propose a new approach to autonomous quantization and HW-aware tuning.

Quantization
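
For background, here is a minimal symmetric int8 quantization routine of the kind a HW-aware tuner would select bit-widths for; this is a generic sketch, not the paper's generative method.

```python
# Minimal symmetric int8 quantization sketch (generic background, not the
# paper's method): per-tensor scale, round, clip, and dequantize to check
# the reconstruction error.
import numpy as np

def quantize_int8(w):
    scale = np.abs(w).max() / 127.0               # per-tensor scale
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

w = np.random.randn(4, 4).astype(np.float32)
q, scale = quantize_int8(w)
w_hat = q.astype(np.float32) * scale              # dequantize
print("max abs error:", np.abs(w - w_hat).max())
```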
