Search Results for author: Sreenivas Subramoney

Found 9 papers, 3 papers with code

CiMNet: Towards Joint Optimization for DNN Architecture and Configuration for Compute-In-Memory Hardware

no code implementations • 19 Feb 2024 • Souvik Kundu, Anthony Sarah, Vinay Joshi, Om J Omer, Sreenivas Subramoney

With the recent growth in demand for large-scale deep neural networks, compute-in-memory (CiM) has emerged as a prominent solution to alleviate the bandwidth and on-chip interconnect bottlenecks that constrain von Neumann architectures.

Neural Architecture Search

Reclaimer: A Reinforcement Learning Approach to Dynamic Resource Allocation for Cloud Microservices

no code implementations • 17 Apr 2023 • Quintin Fettes, Avinash Karanth, Razvan Bunescu, Brandon Beckwith, Sreenivas Subramoney

Many cloud applications are migrated from the monolithic model to a microservices framework in which hundreds of loosely-coupled microservices run concurrently, with significant benefits in terms of scalability, rapid development, modularity, and isolation.

reinforcement-learning

VEGETA: Vertically-Integrated Extensions for Sparse/Dense GEMM Tile Acceleration on CPUs

no code implementations • 17 Feb 2023 • Geonhwa Jeong, Sana Damani, Abhimanyu Rajeshkumar Bambhaniya, Eric Qin, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

Therefore, as DL workloads embrace sparsity to reduce the computations and memory size of models, it is also imperative for CPUs to add support for sparsity to avoid under-utilization of the dense matrix engine and inefficient usage of the caches and registers.

Robust 3D Scene Segmentation through Hierarchical and Learnable Part-Fusion

no code implementations • 16 Nov 2021 • Anirud Thyagharajan, Benjamin Ummenhofer, Prashant Laddha, Om J Omer, Sreenivas Subramoney

3D semantic segmentation is a fundamental building block for several scene understanding applications such as autonomous driving, robotics and AR/VR.

3D Semantic Segmentation, Autonomous Driving

RASA: Efficient Register-Aware Systolic Array Matrix Engine for CPU

no code implementations • 5 Oct 2021 • Geonhwa Jeong, Eric Qin, Ananda Samajdar, Christopher J. Hughes, Sreenivas Subramoney, Hyesoon Kim, Tushar Krishna

As AI-based applications become pervasive, CPU vendors are starting to incorporate matrix engines within the datapath to boost efficiency.

Pythia: A Customizable Hardware Prefetching Framework Using Online Reinforcement Learning

2 code implementations • 24 Sep 2021 • Rahul Bera, Konstantinos Kanellopoulos, Anant V. Nori, Taha Shahroodi, Sreenivas Subramoney, Onur Mutlu

In this paper, we make a case for designing a holistic prefetch algorithm that learns to prefetch using multiple different types of program context and system-level feedback information inherent to its design.

reinforcement-learning, Reinforcement Learning (RL)

GenASM: A High-Performance, Low-Power Approximate String Matching Acceleration Framework for Genome Sequence Analysis

2 code implementations • 16 Sep 2020 • Damla Senol Cali, Gurpreet S. Kalsi, Zülal Bingöl, Can Firtina, Lavanya Subramanian, Jeremie S. Kim, Rachata Ausavarungnirun, Mohammed Alser, Juan Gomez-Luna, Amirali Boroumand, Anant Nori, Allison Scibisz, Sreenivas Subramoney, Can Alkan, Saugata Ghose, Onur Mutlu

Unfortunately, genome sequence analysis is currently bottlenecked by the computational power and memory bandwidth limitations of existing systems, as many of its steps must process a large amount of data.

Hardware Architecture, Genomics
