Search Results for author: Shaoyi Huang

Found 14 papers, 4 papers with code

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM

no code implementations • 22 Jan 2024 • Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding

Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication.

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training

1 code implementation • 14 Dec 2023 • Hongwu Peng, Xi Xie, Kaustubh Shivdikar, MD Amit Hasan, Jiahui Zhao, Shaoyi Huang, Omer Khan, David Kaeli, Caiwen Ding

In this paper, we present MaxK-GNN, an advanced high-performance GPU training system integrating algorithm and system innovation.

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks

1 code implementation • 22 Aug 2023 • Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei Zhang, Tong Geng, Omer Khan, Caiwen Ding

Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and a combined warp strategy.

Computational Efficiency

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference

1 code implementation • ICCV 2023 • Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding

The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients' data privacy and security issues.

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

no code implementations • 24 Apr 2023 • Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

Experimental results show that NDSNN achieves up to 20.52% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99%) as compared to other SOTA methods (e.g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN).

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

no code implementations • 5 Feb 2023 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns.

Privacy Preserving

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off

no code implementations • 30 Nov 2022 • Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding

We further design an acquisition function and provide the theoretical guarantees for the proposed method and clarify its convergence property.

Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach

no code implementations • 6 Nov 2022 • Bin Lei, Shaoyi Huang, Caiwen Ding, Monika Filipovska

We consider the problem of long-term traffic speed forecasting using real large-scale transportation network data from the California Department of Transportation (Caltrans) Performance Measurement System (PeMS).

Decision Making, Graph Attention, +2

PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference

no code implementations • 20 Sep 2022 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Shijin Duan, Nuo Xu, Ran Ran, Shaoyi Huang, Chenghong Wang, Tong Geng, Ang Li, Wujie Wen, Xiaolin Xu, Caiwen Ding

The rapid growth and deployment of deep learning (DL) has witnessed emerging privacy and security concerns.

Neural Architecture Search, Privacy Preserving

Towards Sparsification of Graph Neural Networks

1 code implementation • 11 Sep 2022 • Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, Omer Khan, Caiwen Ding

In this paper, we utilize two state-of-the-art model compression methods, (1) train and prune and (2) sparse training, for the sparsification of weight layers in GNNs.

Image Classification, Link Prediction, +4

A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining

no code implementations • 7 Aug 2022 • Hongwu Peng, Shaoyi Huang, Shiyang Chen, Bingbing Li, Tong Geng, Ang Li, Weiwen Jiang, Wujie Wen, Jinbo Bi, Hang Liu, Caiwen Ding

Particularly, we develop a hardware-friendly sparse attention operator and a length-aware hardware resource scheduling algorithm.

Scheduling, Sentence

An Automatic and Efficient BERT Pruning for Edge AI Systems

no code implementations • 21 Jun 2022 • Shaoyi Huang, Ning Liu, Yueying Liang, Hongwu Peng, Hongjia Li, Dongkuan Xu, Mimi Xie, Caiwen Ding

On MRPC, we obtain a 4.6 higher score than the SOTA at the same overall pruning ratio of 0.5.

Model Compression, MRPC, +4

Accelerating Framework of Transformer by Hardware Design and Model Compression Co-Optimization

no code implementations • 19 Oct 2021 • Panjie Qi, Edwin Hsing-Mean Sha, Qingfeng Zhuge, Hongwu Peng, Shaoyi Huang, Zhenglun Kong, Yuhong Song, Bingbing Li

Our HP can achieve a higher sparsity ratio and is more flexible than other sparsity patterns.

Model Compression

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

no code implementations • ACL 2022 • Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.

Knowledge Distillation

