no code implementations • 10 Apr 2024 • Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna
Moreover, GCV-Turbo supports the execution of standalone CNNs or GNNs, achieving performance comparable to that of state-of-the-art CNN (GNN) accelerators for widely used CNN-only (GNN-only) models.
1 code implementation • 8 Apr 2024 • Neelesh Gupta, Narayanan Kannan, Pengmiao Zhang, Viktor Prasanna
TabConv preserves over 93% of the original model's performance while reducing arithmetic operations by 36.5%, 25.8%, and 99.4% for ResNet-18 on CIFAR-10, CIFAR-100, and MNIST, respectively; 35.6% and 99.3% for ResNet-34 on CIFAR-10 and MNIST; and 98.9% for NIN on MNIST, achieving low-computation inference.
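The table-lookup idea behind this kind of low-computation inference can be illustrated with a product-quantization-style approximation of a matrix multiply. This is a minimal sketch, not TabConv's actual construction; `build_tables` and `lut_matvec` are hypothetical names:

```python
import numpy as np

def build_tables(W, prototypes):
    """Precompute one lookup table per input subspace.

    W          : (d, m) weight matrix
    prototypes : (C, K, s) array, K prototype subvectors per subspace, d = C*s
    returns    : (C, K, m) tables of prototype-times-weight partial products
    """
    C, K, s = prototypes.shape
    d, m = W.shape
    assert C * s == d
    tables = np.empty((C, K, m))
    for c in range(C):
        tables[c] = prototypes[c] @ W[c * s:(c + 1) * s, :]
    return tables

def lut_matvec(x, prototypes, tables):
    """Approximate x @ W using only table lookups and additions."""
    C, K, s = prototypes.shape
    out = np.zeros(tables.shape[2])
    for c in range(C):
        sub = x[c * s:(c + 1) * s]
        # encode: index of the nearest prototype in this subspace
        idx = np.argmin(((prototypes[c] - sub) ** 2).sum(axis=1))
        out += tables[c][idx]  # replaces s multiply-adds per output element
    return out
```

When the prototypes match the input subvectors well, the lookup-sum closely approximates the exact product while replacing almost all multiplications with table reads.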
no code implementations • 6 Apr 2024 • Sachini Wickramasinghe, Dhruv Parikh, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart
We directly train this model on SAR datasets which have limited training samples to evaluate its effectiveness for SAR ATR applications.
no code implementations • 4 Apr 2024 • Xu Wang, Tian Ye, Rajgopal Kannan, Viktor Prasanna
FACTUAL consists of two components: (1) a novel perturbation scheme that, unlike existing works, incorporates realistic physical adversarial attacks (such as OTSA) to build a supervised adversarial pre-training network.
no code implementations • 27 Mar 2024 • Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart
Adversarial attacks have demonstrated the vulnerability of Machine Learning (ML) image classifiers in Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) systems.
no code implementations • 21 Mar 2024 • Dhruv Parikh, Shouyi Li, Bingyi Zhang, Rajgopal Kannan, Carl Busart, Viktor Prasanna
For algorithm design, we systematically combine a hardware-aware structured block-pruning method for pruning model parameters and a dynamic token pruning method for removing unimportant token vectors.
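Dynamic token pruning of the kind described can be sketched as a top-k filter on per-token importance scores. This is an illustration only, not the paper's method; the names are hypothetical:

```python
import numpy as np

def prune_tokens(tokens, scores, keep_ratio=0.5):
    """Drop unimportant token vectors before the next transformer layer.

    tokens : (n, d) token embeddings
    scores : (n,) importance scores, e.g. attention mass received per token
    """
    n = tokens.shape[0]
    k = max(1, int(n * keep_ratio))
    keep = np.argsort(scores)[-k:]  # indices of the k most important tokens
    keep.sort()                     # preserve original token order
    return tokens[keep], keep
```

Because the kept indices are re-sorted, downstream layers see the surviving tokens in their original sequence order.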
1 code implementation • 21 Feb 2024 • Neelesh Gupta, Pengmiao Zhang, Rajgopal Kannan, Viktor Prasanna
Deep neural networks (DNNs) have proven to be effective models for accurate Memory Access Prediction (MAP), a critical task in mitigating memory latency through data prefetching.
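The paper's DNN models are not reproduced here, but the MAP task itself can be illustrated with a minimal table-based delta predictor, a classical baseline that DNN prefetchers improve upon (names hypothetical):

```python
from collections import Counter, defaultdict

def train_delta_table(addresses):
    """Learn how often each address delta follows another (frequency table)."""
    deltas = [b - a for a, b in zip(addresses, addresses[1:])]
    table = defaultdict(Counter)
    for cur, nxt in zip(deltas, deltas[1:]):
        table[cur][nxt] += 1
    return table

def predict_next(addresses, table):
    """Predict the next address via the most frequent successor delta."""
    last_delta = addresses[-1] - addresses[-2]
    if table[last_delta]:
        best_delta = table[last_delta].most_common(1)[0][0]
        return addresses[-1] + best_delta
    return None  # no history for this delta
```

On a strided trace this baseline is exact; DNN-based MAP aims to capture the irregular patterns such tables miss.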
no code implementations • 15 Feb 2024 • Kyle Marino, Pengmiao Zhang, Viktor Prasanna
We evaluate ME-ViT on systolic array sizes of 32 and 16, achieving up to a 9.22$\times$ and 17.89$\times$ overall improvement in memory bandwidth, and a 2.16$\times$ improvement in throughput per DSP for both designs over state-of-the-art ViT accelerators on FPGA.
1 code implementation • 8 Feb 2024 • Gangda Deng, Hongkuan Zhou, Hanqing Zeng, Yinglong Xia, Christopher Leung, Jianbo Li, Rajgopal Kannan, Viktor Prasanna
Recently, Temporal Graph Neural Networks (TGNNs) have demonstrated state-of-the-art performance in various high-impact applications, including fraud detection and content recommendation.
1 code implementation • 1 Feb 2024 • Jacob Fein-Ashley, Tian Ye, Sachini Wickramasinghe, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna
Our experimental results on benchmark grayscale image datasets demonstrate the effectiveness of the proposed model, achieving vastly lower latency (up to 16$\times$ less) and competitive or leading performance compared to other state-of-the-art image classification models on various domain-specific grayscale image classification datasets.
Ranked #15 on Image Classification on Fashion-MNIST
no code implementations • 5 Jan 2024 • Sasindu Wijeratne, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart
This detailed information includes the SAR image features that contributed to the classification, the classification confidence, and the probability of the identified object being classified as a different object type or class.
no code implementations • 12 Dec 2023 • Jacob Fein-Ashley, Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart
Synthetic Aperture Radar (SAR) Automatic Target Recognition (ATR) is a key technique of remote-sensing image recognition, which can be supported by deep neural networks. The existing works of SAR ATR mostly focus on improving the accuracy of target recognition while ignoring the system's performance in terms of speed and storage, which is critical to real-world applications of SAR ATR. For decision-makers aiming to identify a proper deep learning model to deploy in a SAR ATR system, it is important to understand the performance of different candidate deep learning models and determine the best model accordingly. This paper comprehensively benchmarks several advanced deep learning models for SAR ATR with multiple distinct SAR imagery datasets. Specifically, we train and test five SAR image classifiers based on Residual Neural Networks (ResNet18, ResNet34, ResNet50), Graph Neural Network (GNN), and Vision Transformer for Small-Sized Datasets (SS-ViT). We select three datasets (MSTAR, GBSAR, and SynthWakeSAR) that offer heterogeneity. We evaluate and compare the five classifiers concerning their classification accuracy, runtime performance in terms of inference throughput, and analytical performance in terms of number of parameters, number of layers, model size, and number of operations. Experimental results show that the GNN classifier outperforms with respect to throughput and latency. However, it is also shown that no clear model winner emerges from all of our chosen metrics, and a "one model rules all" case is doubtful in the domain of SAR ATR.
no code implementations • 5 Dec 2023 • Tian Ye, Rajgopal Kannan, Viktor Prasanna, Carl Busart, Lance Kaplan
Instead, adversarial attacks should be able to be implemented by physical actions, for example, placing additional false objects as scatterers around the on-ground target to perturb the SAR image and fool the SAR ATR.
no code implementations • 13 Sep 2023 • Samuel Wiggins, Yuan Meng, Rajgopal Kannan, Viktor Prasanna
Multi-Agent Reinforcement Learning (MARL) has achieved significant success in large-scale AI systems and big-data applications such as smart grids, surveillance, etc.
no code implementations • 4 Aug 2023 • Paul Chen, Pavan Manjunath, Sasindu Wijeratne, Bingyi Zhang, Viktor Prasanna
To exploit data sparsity during inference, we devise a runtime kernel mapping strategy that dynamically assigns computation tasks to the PL and AIE based on data sparsity.
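The sparsity-based dispatch idea can be sketched in software as a runtime choice between a dense and a sparse kernel based on measured density. This is an illustration only; the paper assigns tasks to PL and AIE hardware, and all names here are hypothetical:

```python
import numpy as np

def spmv_dense(mat, vec):
    """Dense kernel: best when most entries are non-zero."""
    return mat @ vec

def spmv_sparse(mat, vec):
    """Sparse kernel: touch only the non-zero entries."""
    out = np.zeros(mat.shape[0])
    rows, cols = np.nonzero(mat)
    for r, c in zip(rows, cols):
        out[r] += mat[r, c] * vec[c]
    return out

def spmv(mat, vec, density_threshold=0.3):
    """Runtime dispatch: measure density, then pick a kernel."""
    density = np.count_nonzero(mat) / mat.size
    kernel = spmv_sparse if density < density_threshold else spmv_dense
    return kernel(mat, vec)
```

Both paths produce identical results; only the cost model differs, which is what makes the dispatch safe to decide at runtime.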
no code implementations • 14 Jul 2023 • Hongkuan Zhou, Da Zheng, Xiang Song, George Karypis, Viktor Prasanna
Even worse, the tremendous overhead of synchronizing the node memory makes it impractical to deploy on distributed GPU clusters.
no code implementations • 11 May 2023 • Bingyi Zhang, Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna, Carl Busart
In this work, we propose a graph neural network (GNN) model to achieve accurate and low-latency SAR ATR.
no code implementations • 4 Jan 2023 • Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart
Compared with the state-of-the-art CNNs, the proposed GNN achieves comparable accuracy with $1/3258$ computation cost and $1/83$ model size.
no code implementations • 2 Sep 2022 • Diyi Hu, Chi Zhang, Viktor Prasanna, Bhaskar Krishnamachari
In Multi-Agent Reinforcement Learning, communication is critical to encourage cooperation among agents.
no code implementations • 17 Jul 2022 • Sasindu Wijeratne, Ta-Yang Wang, Rajgopal Kannan, Viktor Prasanna
Implementing accelerators on Field Programmable Gate Arrays (FPGAs) for kernels such as MTTKRP is attractive due to the energy efficiency and inherent parallelism of FPGAs.
1 code implementation • 10 Mar 2022 • Hongkuan Zhou, Bingyi Zhang, Rajgopal Kannan, Viktor Prasanna, Carl Busart
Taking advantage of the model optimizations, we propose a principled hardware architecture using batching, pipelining, and prefetching techniques to further improve the performance.
1 code implementation • NeurIPS 2021 • Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen
We propose a design principle to decouple the depth and scope of GNNs -- to generate the representation of a target entity (i.e., a node or an edge), we first extract a localized subgraph as the bounded-size scope, and then apply a GNN of arbitrary depth on top of the subgraph.
Ranked #3 on Node Classification on Reddit
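A minimal sketch of the decoupling, assuming a plain adjacency-list graph and mean aggregation (hypothetical names, not the paper's implementation): the scope is fixed by a k-hop extraction, while the propagation depth may exceed k:

```python
from collections import deque

def k_hop_subgraph(adj, target, k):
    """Bounded-size scope: all nodes within k hops of the target (BFS)."""
    seen = {target}
    frontier = deque([(target, 0)])
    while frontier:
        node, depth = frontier.popleft()
        if depth == k:
            continue
        for nbr in adj[node]:
            if nbr not in seen:
                seen.add(nbr)
                frontier.append((nbr, depth + 1))
    return seen

def propagate(adj, feats, scope, num_layers):
    """Mean aggregation restricted to the scope; num_layers may exceed the
    scope's hop radius k -- that is the depth/scope decoupling."""
    h = dict(feats)
    for _ in range(num_layers):
        new_h = {}
        for v in scope:
            vals = [h[v]] + [h[u] for u in adj[v] if u in scope]
            new_h[v] = sum(vals) / len(vals)
        h = new_h
    return h
```

Because aggregation never leaves the extracted scope, running more layers deepens the computation without enlarging the receptive field, avoiding the neighborhood explosion of conventional deep GNNs.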
no code implementations • 18 Sep 2021 • Sasindu Wijeratne, Rajgopal Kannan, Viktor Prasanna
This paper focuses on a multi-faceted memory system, which explores the spatial and temporal locality of the data structures of MTTKRP.
1 code implementation • 9 Sep 2021 • Hongkuan Zhou, James Orme-Rogers, Rajgopal Kannan, Viktor Prasanna
SeDyT consists of two components: a Temporal Graph Neural Network that generates dynamic entity embeddings in the past and a sequence model that predicts the entity embeddings in the future.
no code implementations • 21 Aug 2021 • Sasindu Wijeratne, Sanket Pattnaik, Zhiyu Chen, Rajgopal Kannan, Viktor Prasanna
Since developing memory controllers for different applications is time-consuming, this paper introduces a modular and programmable memory controller that can be configured for different target applications on available hardware resources.
1 code implementation • 10 May 2021 • Hongkuan Zhou, Ajitesh Srivastava, Hanqing Zeng, Rajgopal Kannan, Viktor Prasanna
In this paper, we propose to accelerate GNN inference by pruning the dimensions in each layer with negligible accuracy loss.
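Dimension pruning of this kind can be sketched as dropping low-importance hidden dimensions jointly from the producing and consuming weight matrices. This is an illustration with hypothetical names, not the paper's pruning criterion:

```python
import numpy as np

def prune_dims(W_out, W_next_in, keep_ratio=0.5):
    """Prune hidden dimensions between two layers.

    A dimension's importance is scored by the product of its outgoing-column
    and incoming-row L2 norms; low scorers are dropped from both matrices.
    """
    importance = (np.linalg.norm(W_out, axis=0)
                  * np.linalg.norm(W_next_in, axis=1))
    k = max(1, int(W_out.shape[1] * keep_ratio))
    keep = np.sort(np.argsort(importance)[-k:])
    return W_out[:, keep], W_next_in[keep, :]
```

Dimensions whose weights are (near) zero contribute nothing to the output, so removing them shrinks both layers with negligible accuracy loss.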
no code implementations • 1 Jan 2021 • Chi Zhang, Sanmukh Rao Kuppannagari, Viktor Prasanna
The goal of Offline Reinforcement Learning (RL) is to address this problem by learning effective policies using previously collected datasets.
2 code implementations • 2 Dec 2020 • Hanqing Zeng, Muhan Zhang, Yinglong Xia, Ajitesh Srivastava, Andrey Malevich, Rajgopal Kannan, Viktor Prasanna, Long Jin, Ren Chen
We propose a simple "deep GNN, shallow sampler" design principle to improve both the GNN accuracy and efficiency -- to generate representation of a target node, we use a deep GNN to pass messages only within a shallow, localized subgraph.
2 code implementations • 5 Oct 2020 • Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna
For feature propagation within subgraphs, we improve cache utilization and reduce DRAM traffic by data partitioning.
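The partitioning idea can be sketched as computing the propagation `adj @ feats` one destination-node partition at a time, so each partition's output rows stay cache-resident. This is a simplified illustration with hypothetical names:

```python
import numpy as np

def partitioned_propagate(adj, feats, part_size):
    """Compute adj @ feats in destination-node partitions: each pass
    writes only a small, cache-resident slice of the output."""
    n = adj.shape[0]
    out = np.zeros_like(feats)
    for start in range(0, n, part_size):
        end = min(start + part_size, n)
        out[start:end] = adj[start:end] @ feats  # one partition per pass
    return out
```

The result is identical to the unpartitioned product; the gain is purely in locality, since each pass reuses a bounded working set instead of streaming the whole output through DRAM.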
1 code implementation • 31 Dec 2019 • Hanqing Zeng, Viktor Prasanna
We first analyze the computation and communication characteristics of various GCN training algorithms, and select a subgraph-based algorithm that is well suited for hardware execution.
no code implementations • 16 Oct 2019 • Yue Niu, Hanqing Zeng, Ajitesh Srivastava, Kartik Lakhotia, Rajgopal Kannan, Yanzhi Wang, Viktor Prasanna
On the other hand, weight pruning techniques address the redundancy in model parameters by converting dense convolutional kernels into sparse ones.
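Magnitude-based pruning, the simplest instance of converting dense kernels into sparse ones, can be sketched as follows (illustration only; the paper's pruning scheme may differ):

```python
import numpy as np

def magnitude_prune(kernel, sparsity=0.5):
    """Zero out the smallest-magnitude weights of a dense kernel."""
    flat = np.abs(kernel).ravel()
    k = int(flat.size * sparsity)
    if k == 0:
        return kernel.copy()
    threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
    pruned = kernel.copy()
    pruned[np.abs(pruned) <= threshold] = 0.0
    return pruned
```

The zeroed entries can then be skipped entirely by a sparse convolution kernel, which is where the computation savings come from.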
7 code implementations • ICLR 2020 • Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna
Graph Convolutional Networks (GCNs) are powerful models for learning representations of attributed graphs.
Ranked #1 on Link Property Prediction on ogbl-citation2
2 code implementations • 28 Oct 2018 • Hanqing Zeng, Hongkuan Zhou, Ajitesh Srivastava, Rajgopal Kannan, Viktor Prasanna
However, a major challenge is to reduce the complexity of layered GCNs and make them parallelizable and scalable on very large graphs -- state-of-the-art techniques are unable to achieve scalability without losing accuracy and efficiency.
1 code implementation • 21 Sep 2017 • Kartik Lakhotia, Rajgopal Kannan, Viktor Prasanna
The traditional PageRank implementation generates fine granularity random memory accesses resulting in large amount of wasteful DRAM traffic and poor bandwidth utilization.
Distributed, Parallel, and Cluster Computing • Data Structures and Algorithms • Performance
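The fine-grained random accesses arise from the per-edge scatter in standard power-iteration PageRank, sketched below. The partition-centric restructuring itself is not shown; this minimal version assumes every node has at least one outgoing edge:

```python
def pagerank(edges, n, d=0.85, iters=20):
    """Power-iteration PageRank over an edge list.

    The per-edge scatter to nxt[dst] is the fine-grained random memory
    access pattern that partition-centric processing restructures into
    sequential, partition-local traffic.
    """
    out_deg = [0] * n
    for src, _ in edges:
        out_deg[src] += 1
    pr = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - d) / n] * n
        for src, dst in edges:
            nxt[dst] += d * pr[src] / out_deg[src]  # random read + write
        pr = nxt
    return pr
```

Because `dst` jumps arbitrarily across the rank array, each edge can trigger a cache miss; binning edges by destination partition turns these scatters into streaming writes.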