Search Results for author: Caiwen Ding

Found 76 papers, 14 papers with code

Weakly Supervised Change Detection via Knowledge Distillation and Multiscale Sigmoid Inference

1 code implementation • 9 Mar 2024 • Binghao Lu, Caiwen Ding, Jinbo Bi, Dongjin Song

Moreover, we designed a Multiscale Sigmoid Inference (MSI) module as a post processing step to further refine the change probability map from the trained student network.

Change Detection Knowledge Distillation +1

Paper
Code

Key Information Retrieval to Classify the Unstructured Data Content of Preferential Trade Agreements

no code implementations • 23 Jan 2024 • Jiahui Zhao, Ziyi Meng, Stepan Gordeev, Zijie Pan, Dongjin Song, Sandro Steinbach, Caiwen Ding

To address this issue, we introduce a novel approach to long-text classification and prediction.

Information Retrieval Retrieval +2

Paper
Add Code

Zero-Space Cost Fault Tolerance for Transformer-based Language Models on ReRAM

no code implementations • 22 Jan 2024 • Bingbing Li, Geng Yuan, Zigeng Wang, Shaoyi Huang, Hongwu Peng, Payman Behnam, Wujie Wen, Hang Liu, Caiwen Ding

Resistive Random Access Memory (ReRAM) has emerged as a promising platform for deep neural networks (DNNs) due to its support for parallel in-situ matrix-vector multiplication.

Paper
Add Code

FlashVideo: A Framework for Swift Inference in Text-to-Video Generation

no code implementations • 30 Dec 2023 • Bin Lei, Le Chen, Caiwen Ding

In the evolving field of machine learning, video generation has witnessed significant advancements with autoregressive-based transformer models and diffusion models, known for synthesizing dynamic and realistic scenes.

Text-to-Video Generation Video Generation

Paper
Add Code

MaxK-GNN: Extremely Fast GPU Kernel Design for Accelerating Graph Neural Networks Training

1 code implementation • 14 Dec 2023 • Hongwu Peng, Xi Xie, Kaustubh Shivdikar, MD Amit Hasan, Jiahui Zhao, Shaoyi Huang, Omer Khan, David Kaeli, Caiwen Ding

In this paper, we present MaxK-GNN, an advanced high-performance GPU training system integrating algorithm and system innovation.

Paper
Code

Advanced Large Language Model (LLM)-Driven Verilog Development: Enhancing Power, Performance, and Area Optimization in Code Synthesis

no code implementations • 2 Dec 2023 • Kiran Thorat, Jiahui Zhao, Yaotian Liu, Hongwu Peng, Xi Xie, Bin Lei, Jeff Zhang, Caiwen Ding

The increasing use of Advanced Language Models (ALMs) in diverse sectors, particularly due to their impressive capability to generate top-tier content following linguistic instructions, forms the core of this investigation.

Language Modelling Large Language Model

Paper
Add Code

Evaluating Emerging AI/ML Accelerators: IPU, RDU, and NVIDIA/AMD GPUs

no code implementations • 8 Nov 2023 • Hongwu Peng, Caiwen Ding, Tong Geng, Sutanay Choudhury, Kevin Barker, Ang Li

The relentless advancement of artificial intelligence (AI) and machine learning (ML) applications necessitates the development of specialized hardware accelerators capable of handling the increasing complexity and computational demands.

Paper
Add Code

DeeDiff: Dynamic Uncertainty-Aware Early Exiting for Accelerating Diffusion Model Generation

no code implementations • 29 Sep 2023 • Shengkun Tang, Yaqing Wang, Caiwen Ding, Yi Liang, Yao Li, Dongkuan Xu

In this work, we propose DeeDiff, an early exiting framework that adaptively allocates computation resources in each sampling step to improve the generation efficiency of diffusion models.

text-guided-generation

Paper
Add Code

Accel-GCN: High-Performance GPU Accelerator Design for Graph Convolution Networks

1 code implementation • 22 Aug 2023 • Xi Xie, Hongwu Peng, Amit Hasan, Shaoyi Huang, Jiahui Zhao, Haowen Fang, Wei zhang, Tong Geng, Omer Khan, Caiwen Ding

Utilizing these principles, we formulated a kernel for sparse matrix multiplication (SpMM) in GCNs that employs block-level partitioning and combined warp strategy.

Computational Efficiency

Paper
Code

AutoReP: Automatic ReLU Replacement for Fast Private Network Inference

1 code implementation • ICCV 2023 • Hongwu Peng, Shaoyi Huang, Tong Zhou, Yukui Luo, Chenghong Wang, Zigeng Wang, Jiahui Zhao, Xi Xie, Ang Li, Tony Geng, Kaleel Mahmood, Wujie Wen, Xiaolin Xu, Caiwen Ding

The growth of the Machine-Learning-As-A-Service (MLaaS) market has highlighted clients' data privacy and security issues.

Paper
Code

Boosting Logical Reasoning in Large Language Models through a New Framework: The Graph of Thought

no code implementations • 16 Aug 2023 • Bin Lei, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

Recent advancements in large-scale models, such as GPT-4, have showcased remarkable capabilities in addressing standard queries.

Logical Reasoning

Paper
Add Code

Towards Zero Memory Footprint Spiking Neural Network Training

no code implementations • 16 Aug 2023 • Bin Lei, Sheng Lin, Pei-Hung Lin, Chunhua Liao, Caiwen Ding

Our design is able to achieve a $\mathbf{58. 65\times}$ reduction in memory usage compared to the current SNN node.

Paper
Add Code

Tango: rethinking quantization for graph neural network training on GPUs

no code implementations • 2 Aug 2023 • Shiyang Chen, Da Zheng, Caiwen Ding, Chengying Huan, Yuede Ji, Hang Liu

Graph Neural Networks (GNNs) are becoming increasingly popular due to their superior performance in critical graph-related tasks.

Quantization

Paper
Add Code

Spectral-DP: Differentially Private Deep Learning through Spectral Perturbation and Filtering

no code implementations • 25 Jul 2023 • Ce Feng, Nuo Xu, Wujie Wen, Parv Venkitasubramaniam, Caiwen Ding

In particular, for fully connected layers, we combine a block-circulant based spatial restructuring with Spectral-DP to achieve better utility.

Transfer Learning

Paper
Add Code

Creating a Dataset for High-Performance Computing Code Translation using LLMs: A Bridge Between OpenMP Fortran and C++

1 code implementation • 15 Jul 2023 • Bin Lei, Caiwen Ding, Le Chen, Pei-Hung Lin, Chunhua Liao

In this study, we present a novel dataset for training machine learning models translating between OpenMP Fortran and C++ code.

C++ code Code Translation +1

Paper
Code

Multi-Task Models Adversarial Attacks

1 code implementation • 20 May 2023 • Lijun Zhang, Xiao Liu, Kaleel Mahmood, Caiwen Ding, Hui Guan

We then introduce a novel attack framework, the Gradient Balancing Multi-Task Attack (GB-MTA), which treats attacking a multi-task model as an optimization problem.

Multi-Task Learning

Paper
Code

Neurogenesis Dynamics-inspired Spiking Neural Network Training Acceleration

no code implementations • 24 Apr 2023 • Shaoyi Huang, Haowen Fang, Kaleel Mahmood, Bowen Lei, Nuo Xu, Bin Lei, Yue Sun, Dongkuan Xu, Wujie Wen, Caiwen Ding

Experimental results show that NDSNN achieves up to 20. 52\% improvement in accuracy on Tiny-ImageNet using ResNet-19 (with a sparsity of 99\%) as compared to other SOTA methods (e. g., Lottery Ticket Hypothesis (LTH), SET-SNN, RigL-SNN).

Paper
Add Code

Surrogate Lagrangian Relaxation: A Path To Retrain-free Deep Neural Network Pruning

no code implementations • 8 Apr 2023 • Shanglin Zhou, Mikhail A. Bragin, Lynn Pepin, Deniz Gurevin, Fei Miao, Caiwen Ding

We evaluate our method on image classification tasks using CIFAR-10 and ImageNet with state-of-the-art MLP-Mixer, Swin Transformer, and VGG-16, ResNet-18, ResNet-50 and ResNet-110, MobileNetV2.

Image Classification Lane Detection +4

Paper
Add Code

Physics-aware Roughness Optimization for Diffractive Optical Neural Networks

no code implementations • 4 Apr 2023 • Shanglin Zhou, Yingjie Li, Minhan Lou, Weilu Gao, Zhijie Shi, Cunxi Yu, Caiwen Ding

As a representative next-generation device/circuit technology beyond CMOS, diffractive optical neural networks (DONNs) have shown promising advantages over conventional deep neural networks due to extreme fast computation speed (light speed) and low energy consumption.

Paper
Add Code

Collaborative Multi-Object Tracking with Conformal Uncertainty Propagation

no code implementations • 25 Mar 2023 • Sanbao Su, Songyang Han, Yiming Li, Zhili Zhang, Chen Feng, Caiwen Ding, Fei Miao

MOT-CUP demonstrates the importance of uncertainty quantification in both COD and MOT, and provides the first attempt to improve the accuracy and reduce the uncertainty in MOT based on COD through uncertainty propagation.

Autonomous Vehicles Conformal Prediction +7

Paper
Add Code

Shared Information-Based Safe And Efficient Behavior Planning For Connected Autonomous Vehicles

no code implementations • 8 Feb 2023 • Songyang Han, Shanglin Zhou, Lynn Pepin, Jiangwei Wang, Caiwen Ding, Fei Miao

The recent advancements in wireless technology enable connected autonomous vehicles (CAVs) to gather data via vehicle-to-vehicle (V2V) communication, such as processed LIDAR and camera data from other vehicles.

Autonomous Vehicles Multi-agent Reinforcement Learning

Paper
Add Code

RRNet: Towards ReLU-Reduced Neural Network for Two-party Computation Based Private Inference

no code implementations • 5 Feb 2023 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Nuo Xu, Shijin Duan, Ran Ran, Jiahui Zhao, Shaoyi Huang, Xi Xie, Chenghong Wang, Tong Geng, Wujie Wen, Xiaolin Xu, Caiwen Ding

The proliferation of deep learning (DL) has led to the emergence of privacy and security concerns.

Privacy Preserving

Paper
Add Code

Accelerating Dataset Distillation via Model Augmentation

2 code implementations • CVPR 2023 • Lei Zhang, Jie Zhang, Bowen Lei, Subhabrata Mukherjee, Xiang Pan, Bo Zhao, Caiwen Ding, Yao Li, Dongkuan Xu

Dataset Distillation (DD), a newly emerging field, aims at generating much smaller but efficient synthetic training datasets from large ones.

1,153

Paper
Code

All-in-One: A Highly Representative DNN Pruning Framework for Edge Devices with Dynamic Power Management

no code implementations • 9 Dec 2022 • Yifan Gong, Zheng Zhan, Pu Zhao, Yushu Wu, Chao Wu, Caiwen Ding, Weiwen Jiang, Minghai Qin, Yanzhi Wang

By re-configuring the model to the corresponding pruning ratio for a specific execution frequency (and voltage), we are able to achieve stable inference speed, i. e., keeping the difference in speed performance under various execution frequencies as small as possible.

Management

Paper
Add Code

Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off

no code implementations • 30 Nov 2022 • Shaoyi Huang, Bowen Lei, Dongkuan Xu, Hongwu Peng, Yue Sun, Mimi Xie, Caiwen Ding

We further design an acquisition function and provide the theoretical guarantees for the proposed method and clarify its convergence property.

Paper
Add Code

Game Theoretic Mixed Experts for Combinational Adversarial Machine Learning

1 code implementation • 26 Nov 2022 • Ethan Rathbun, Kaleel Mahmood, Sohaib Ahmad, Caiwen Ding, Marten van Dijk

First, how can the low transferability between defenses be utilized in a game theoretic framework to improve the robustness?

Adversarial Defense

Paper
Code

You Need Multiple Exiting: Dynamic Early Exiting for Accelerating Unified Vision Language Model

1 code implementation • CVPR 2023 • Shengkun Tang, Yaqing Wang, Zhenglun Kong, Tianchi Zhang, Yao Li, Caiwen Ding, Yanzhi Wang, Yi Liang, Dongkuan Xu

To handle this challenge, we propose a novel early exiting strategy for unified visual language models, which allows dynamically skip the layers in encoder and decoder simultaneously in term of input layer-wise similarities with multiple times of early exiting, namely \textbf{MuE}.

Language Modelling

2,319

Paper
Code

Efficient Traffic State Forecasting using Spatio-Temporal Network Dependencies: A Sparse Graph Neural Network Approach

no code implementations • 6 Nov 2022 • Bin Lei, Shaoyi Huang, Caiwen Ding, Monika Filipovska

We consider the problem of long-term traffic speed forecasting for a real large-scale transportation network data from the California Department of Transportation (Caltrans) Performance Measurement System (PeMS).

Decision Making Graph Attention +2

Paper
Add Code

Towards Real-Time Temporal Graph Learning

1 code implementation • 8 Oct 2022 • Deniz Gurevin, Mohsin Shan, Tong Geng, Weiwen Jiang, Caiwen Ding, Omer Khan

Prior work operates on pre-collected temporal graph data and is not designed to handle updates on a graph in real-time.

graph construction Graph Learning +3

Paper
Code

PolyMPCNet: Towards ReLU-free Neural Architecture Search in Two-party Computation Based Private Inference

no code implementations • 20 Sep 2022 • Hongwu Peng, Shanglin Zhou, Yukui Luo, Shijin Duan, Nuo Xu, Ran Ran, Shaoyi Huang, Chenghong Wang, Tong Geng, Ang Li, Wujie Wen, Xiaolin Xu, Caiwen Ding

The rapid growth and deployment of deep learning (DL) has witnessed emerging privacy and security concerns.

Neural Architecture Search Privacy Preserving

Paper
Add Code

Uncertainty Quantification of Collaborative Detection for Self-Driving

1 code implementation • 16 Sep 2022 • Sanbao Su, Yiming Li, Sihong He, Songyang Han, Chen Feng, Caiwen Ding, Fei Miao

Our work is the first to estimate the uncertainty of collaborative object detection.

Autonomous Vehicles Object +3

Paper
Code

Towards Sparsification of Graph Neural Networks

1 code implementation • 11 Sep 2022 • Hongwu Peng, Deniz Gurevin, Shaoyi Huang, Tong Geng, Weiwen Jiang, Omer Khan, Caiwen Ding

In this paper, we utilize two state-of-the-art model compression methods (1) train and prune and (2) sparse training for the sparsification of weight layers in GNNs.

Image Classification Link Prediction +4

Paper
Code

Attacking the Spike: On the Transferability and Security of Spiking Neural Networks to Adversarial Examples

no code implementations • 7 Sep 2022 • Nuo Xu, Kaleel Mahmood, Haowen Fang, Ethan Rathbun, Caiwen Ding, Wujie Wen

First, we show that successful white-box adversarial attacks on SNNs are highly dependent on the underlying surrogate gradient technique, even in the case of adversarially trained SNNs.

Adversarial Attack

Paper
Add Code

A Length Adaptive Algorithm-Hardware Co-design of Transformer on FPGA Through Sparse Attention and Dynamic Pipelining

no code implementations • 7 Aug 2022 • Hongwu Peng, Shaoyi Huang, Shiyang Chen, Bingbing Li, Tong Geng, Ang Li, Weiwen Jiang, Wujie Wen, Jinbo Bi, Hang Liu, Caiwen Ding

Particularly, we develop a hardware-friendly sparse attention operator and a length-aware hardware resource scheduling algorithm.

Scheduling Sentence

Paper
Add Code

EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System

no code implementations • 14 Jul 2022 • Sahidul Islam, Shanglin Zhou, Ran Ran, Yufang Jin, Wujie Wen, Caiwen Ding, Mimi Xie

Energy harvesting (EH) technology that harvests energy from ambient environment is a promising alternative to batteries for powering those devices due to the low maintenance cost and wide availability of the energy sources.

AutoML Model extraction

Paper
Add Code

An Automatic and Efficient BERT Pruning for Edge AI Systems

no code implementations • 21 Jun 2022 • Shaoyi Huang, Ning Liu, Yueying Liang, Hongwu Peng, Hongjia Li, Dongkuan Xu, Mimi Xie, Caiwen Ding

On MRPC, we obtain a 4. 6 higher score than the SOTA at the same overall pruning ratio of 0. 5.

Model Compression MRPC +4

Paper
Add Code

A Secure and Efficient Federated Learning Framework for NLP

no code implementations • EMNLP 2021 • Jieren Deng, Chenghong Wang, Xianrui Meng, Yijue Wang, Ji Li, Sheng Lin, Shuo Han, Fei Miao, Sanguthevar Rajasekaran, Caiwen Ding

In this work, we consider the problem of designing secure and efficient federated learning (FL) frameworks.

Federated Learning

Paper
Add Code

Enabling Fast Deep Learning on Tiny Energy-Harvesting IoT Devices

no code implementations • 28 Nov 2021 • Sahidul Islam, Jieren Deng, Shanglin Zhou, Chen Pan, Caiwen Ding, Mimi Xie

Energy harvesting (EH) IoT devices that operate intermittently without batteries, coupled with advances in deep neural networks (DNNs), have opened up new opportunities for enabling sustainable smart applications.

Quantization

Paper
Add Code

Detecting Gender Bias in Transformer-based Models: A Case Study on BERT

no code implementations • 15 Oct 2021 • Bingbing Li, Hongwu Peng, Rajat Sainju, Junhuan Yang, Lei Yang, Yueying Liang, Weiwen Jiang, Binghui Wang, Hang Liu, Caiwen Ding

In this paper, we propose a novel gender bias detection method by utilizing attention map for transformer-based models.

Bias Detection Gender Bias Detection

Paper
Add Code

Sparse Progressive Distillation: Resolving Overfitting under Pretrain-and-Finetune Paradigm

no code implementations • ACL 2022 • Shaoyi Huang, Dongkuan Xu, Ian E. H. Yen, Yijue Wang, Sung-En Chang, Bingbing Li, Shiyang Chen, Mimi Xie, Sanguthevar Rajasekaran, Hang Liu, Caiwen Ding

Conventional wisdom in pruning Transformer-based language models is that pruning reduces the model expressiveness and thus is more likely to underfit rather than overfit.

Knowledge Distillation

Paper
Add Code

Dr. Top-k: Delegate-Centric Top-k on GPUs

1 code implementation • 16 Sep 2021 • Anil Gaihre, Da Zheng, Scott Weitze, Lingda Li, Shuaiwen Leon Song, Caiwen Ding, Xiaoye S Li, Hang Liu

Recent top-$k$ computation efforts explore the possibility of revising various sorting algorithms to answer top-$k$ queries on GPUs.

Paper
Code

Exploration of Quantum Neural Architecture by Mixing Quantum Neuron Designs

no code implementations • 8 Sep 2021 • Zhepeng Wang, Zhiding Liang, Shanglin Zhou, Caiwen Ding, Yiyu Shi, Weiwen Jiang

Experimental results demonstrate that the identified quantum neural architectures with mixed quantum neurons can achieve 90. 62% of accuracy on the MNIST dataset, compared with 52. 77% and 69. 92% on the VQC and QuantumFlow, respectively.

Paper
Add Code

Binary Complex Neural Network Acceleration on FPGA

no code implementations • 10 Aug 2021 • Hongwu Peng, Shanglin Zhou, Scott Weitze, Jiaxin Li, Sahidul Islam, Tong Geng, Ang Li, Wei zhang, Minghu Song, Mimi Xie, Hang Liu, Caiwen Ding

Deep complex networks (DCN), in contrast, can learn from complex data, but have high computational costs; therefore, they cannot satisfy the instant decision-making requirements of many deployable systems dealing with short observations or short signal bursts.

Decision Making

Paper
Add Code

FORMS: Fine-grained Polarized ReRAM-based In-situ Computation for Mixed-signal DNN Accelerator

no code implementations • 16 Jun 2021 • Geng Yuan, Payman Behnam, Zhengang Li, Ali Shafiee, Sheng Lin, Xiaolong Ma, Hang Liu, Xuehai Qian, Mahdi Nazm Bojnordi, Yanzhi Wang, Caiwen Ding

With weights stored in the ReRAM crossbar cells as conductance, when the input vector is applied to word lines, the matrix-vector multiplication results can be generated as the current in bit lines.

Paper
Add Code

A Compression-Compilation Framework for On-mobile Real-time BERT Applications

no code implementations • 30 May 2021 • Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

In this paper, we propose a compression-compilation co-design framework that can guarantee the identified model to meet both resource and real-time specifications of mobile devices.

Question Answering Text Generation

Paper
Add Code

TAG: Gradient Attack on Transformer-based Language Models

1 code implementation • Findings (EMNLP) 2021 • Jieren Deng, Yijue Wang, Ji Li, Chao Shang, Cao Qin, Hang Liu, Sanguthevar Rajasekaran, Caiwen Ding

In this paper, as the first attempt, we formulate the gradient attack problem on the Transformer-based language models and propose a gradient attack algorithm, TAG, to reconstruct the local training data.

Federated Learning Cryptography and Security

239

Paper
Code

Dancing along Battery: Enabling Transformer with Run-time Reconfigurability on Mobile Devices

no code implementations • 12 Feb 2021 • Yuhong Song, Weiwen Jiang, Bingbing Li, Panjie Qi, Qingfeng Zhuge, Edwin Hsing-Mean Sha, Sakyasingha Dasgupta, Yiyu Shi, Caiwen Ding

Specifically, RT3 integrates two-level optimizations: First, it utilizes an efficient BP as the first-step compression for resource-constrained mobile devices; then, RT3 heuristically generates a shrunken search space based on the first level optimization and searches multiple pattern sets with diverse sparsity for PP via reinforcement learning to support lightweight software reconfiguration, which corresponds to available frequency levels of DVFS (i. e., hardware reconfiguration).

AutoML

Paper
Add Code

Enabling Retrain-free Deep Neural Network Pruning using Surrogate Lagrangian Relaxation

no code implementations • 18 Dec 2020 • Deniz Gurevin, Shanglin Zhou, Lynn Pepin, Bingbing Li, Mikhail Bragin, Caiwen Ding, Fei Miao

We further accelerate the convergence of the SLR by using quadratic penalties.

Image Classification Lane Detection +4

Paper
Add Code

Efficient Transformer-based Large Scale Language Representations using Hardware-friendly Block Structured Pruning

no code implementations • Findings of the Association for Computational Linguistics 2020 • Bingbing Li, Zhenglun Kong, Tianyun Zhang, Ji Li, Zhengang Li, Hang Liu, Caiwen Ding

Pre-trained large-scale language models have increasingly demonstrated high accuracy on many natural language processing (NLP) tasks.

Edge-computing Knowledge Distillation

Paper
Add Code

Real-Time Execution of Large-scale Language Models on Mobile

no code implementations • 15 Sep 2020 • Wei Niu, Zhenglun Kong, Geng Yuan, Weiwen Jiang, Jiexiong Guan, Caiwen Ding, Pu Zhao, Sijia Liu, Bin Ren, Yanzhi Wang

Our framework can guarantee the identified model to meet both resource and real-time specifications of mobile devices, thus achieving real-time execution of large transformer-based models like BERT variants.

Edge-computing

Paper
Add Code

SAPAG: A Self-Adaptive Privacy Attack From Gradients

no code implementations • 14 Sep 2020 • Yijue Wang, Jieren Deng, Dan Guo, Chenghong Wang, Xianrui Meng, Hang Liu, Caiwen Ding, Sanguthevar Rajasekaran

Distributed learning such as federated learning or collaborative learning enables model training on decentralized data from users and only collects local gradients, where data is processed close to its sources for data privacy.

Federated Learning Reconstruction Attack

Paper
Add Code

ESMFL: Efficient and Secure Models for Federated Learning

no code implementations • 3 Sep 2020 • Sheng Lin, Chenghong Wang, Hongjia Li, Jieren Deng, Yanzhi Wang, Caiwen Ding

Nowadays, Deep Neural Networks are widely applied to various domains.

Federated Learning Privacy Preserving

Paper
Add Code

Against Membership Inference Attack: Pruning is All You Need

no code implementations • 28 Aug 2020 • Yijue Wang, Chenghong Wang, Zigeng Wang, Shanglin Zhou, Hang Liu, Jinbo Bi, Caiwen Ding, Sanguthevar Rajasekaran

The large model size, high computational operations, and vulnerability against membership inference attack (MIA) have impeded deep learning or deep neural networks (DNNs) popularity, especially on mobile devices.

Fraud Detection Inference Attack +2

Paper
Add Code

FTRANS: Energy-Efficient Acceleration of Transformers using FPGA

no code implementations • 16 Jul 2020 • Bingbing Li, Santosh Pandey, Haowen Fang, Yanjun Lyv, Ji Li, Jieyang Chen, Mimi Xie, Lipeng Wan, Hang Liu, Caiwen Ding

In natural language processing (NLP), the "Transformer" architecture was proposed as the first transduction model replying entirely on self-attention mechanisms without using sequence-aligned recurrent neural networks (RNNs) or convolution, and it achieved significant improvements for sequence to sequence tasks.

Model Compression

Paper
Add Code

A Unified DNN Weight Compression Framework Using Reweighted Optimization Methods

no code implementations • 12 Apr 2020 • Tianyun Zhang, Xiaolong Ma, Zheng Zhan, Shanglin Zhou, Minghai Qin, Fei Sun, Yen-Kuang Chen, Caiwen Ding, Makan Fardad, Yanzhi Wang

To address the large model size and intensive computation requirement of deep neural networks (DNNs), weight pruning techniques have been proposed and generally fall into two categories, i. e., static regularization-based pruning and dynamic regularization-based pruning.

Paper
Add Code

A Privacy-Preserving-Oriented DNN Pruning and Mobile Acceleration Framework

no code implementations • 13 Mar 2020 • Yifan Gong, Zheng Zhan, Zhengang Li, Wei Niu, Xiaolong Ma, Wenhao Wang, Bin Ren, Caiwen Ding, Xue Lin, Xiao-Lin Xu, Yanzhi Wang

Weight pruning of deep neural networks (DNNs) has been proposed to satisfy the limited storage and computing capability of mobile edge devices.

Model Compression Privacy Preserving

Paper
Add Code

A Multi-Agent Reinforcement Learning Approach For Safe and Efficient Behavior Planning Of Connected Autonomous Vehicles

no code implementations • 9 Mar 2020 • Songyang Han, Shanglin Zhou, Jiangwei Wang, Lynn Pepin, Caiwen Ding, Jie Fu, Fei Miao

The truncated Q-function utilizes the shared information from neighboring CAVs such that the joint state and action spaces of the Q-function do not grow in our algorithm for a large-scale CAV system.

Autonomous Vehicles Multi-agent Reinforcement Learning +1

Paper
Add Code

Towards an Efficient and General Framework of Robust Training for Graph Neural Networks

no code implementations • 25 Feb 2020 • Kaidi Xu, Sijia Liu, Pin-Yu Chen, Mengshu Sun, Caiwen Ding, Bhavya Kailkhura, Xue Lin

To overcome these limitations, we propose a general framework which leverages the greedy search algorithms and zeroth-order methods to obtain robust GNNs in a generic and an efficient manner.

Paper
Add Code

A SOT-MRAM-based Processing-In-Memory Engine for Highly Compressed DNN Implementation

no code implementations • 24 Nov 2019 • Geng Yuan, Xiaolong Ma, Sheng Lin, Zhengang Li, Caiwen Ding

Thus, the footprint and power consumption of SOT-MRAM PIM can be reduced, while increasing the overall system throughput at the meantime, making our proposed ADMM-based SOT-MRAM PIM more energy efficiency and suitable for embedded systems or IoT devices.

Model Compression Quantization

Paper
Add Code

Deep Compressed Pneumonia Detection for Low-Power Embedded Devices

no code implementations • 4 Nov 2019 • Hongjia Li, Sheng Lin, Ning Liu, Caiwen Ding, Yanzhi Wang

Deep neural networks (DNNs) have been expanded into medical fields and triggered the revolution of some medical applications by extracting complex features and achieving high accuracy and performance, etc.

Pneumonia Detection

Paper
Add Code

REQ-YOLO: A Resource-Aware, Efficient Quantization Framework for Object Detection on FPGAs

no code implementations • 29 Sep 2019 • Caiwen Ding, Shuo Wang, Ning Liu, Kaidi Xu, Yanzhi Wang, Yun Liang

To achieve real-time, highly-efficient implementations on FPGA, we present the detailed hardware implementation of block circulant matrices on CONV layers and develop an efficient processing element (PE) structure supporting the heterogeneous weight quantization, CONV dataflow and pipelining techniques, design optimization, and a template-based automatic synthesis framework to optimally exploit hardware resource.

Model Compression object-detection +2

Paper
Add Code

An Ultra-Efficient Memristor-Based DNN Framework with Structured Weight Pruning and Quantization Using ADMM

no code implementations • 29 Aug 2019 • Geng Yuan, Xiaolong Ma, Caiwen Ding, Sheng Lin, Tianyun Zhang, Zeinab S. Jalali, Yilong Zhao, Li Jiang, Sucheta Soundarajan, Yanzhi Wang

Memristor-based weight pruning and weight quantization have been seperately investigated and proven effectiveness in reducing area and power consumption compared to the original DNN model.

Quantization

Paper
Add Code

Tiny but Accurate: A Pruned, Quantized and Optimized Memristor Crossbar Framework for Ultra Efficient DNN Implementation

no code implementations • 27 Aug 2019 • Xiaolong Ma, Geng Yuan, Sheng Lin, Caiwen Ding, Fuxun Yu, Tao Liu, Wujie Wen, Xiang Chen, Yanzhi Wang

To mitigate the challenges, the memristor crossbar array has emerged as an intrinsically suitable matrix computation and low-power acceleration framework for DNN applications.

Model Compression Quantization

Paper
Add Code

A Stochastic-Computing based Deep Learning Framework using Adiabatic Quantum-Flux-Parametron SuperconductingTechnology

no code implementations • 22 Jul 2019 • Ruizhe Cai, Ao Ren, Olivia Chen, Ning Liu, Caiwen Ding, Xuehai Qian, Jie Han, Wenhui Luo, Nobuyuki Yoshikawa, Yanzhi Wang

Further, the application of SC has been investigated in DNNs in prior work, and the suitability has been illustrated as SC is more compatible with approximate computations.

Paper
Add Code

E-RNN: Design Optimization for Efficient Recurrent Neural Networks in FPGAs

no code implementations • 12 Dec 2018 • Zhe Li, Caiwen Ding, Siyue Wang, Wujie Wen, Youwei Zhuo, Chang Liu, Qinru Qiu, Wenyao Xu, Xue Lin, Xuehai Qian, Yanzhi Wang

It is a challenging task to have real-time, efficient, and accurate hardware RNN implementations because of the high sensitivity to imprecision accumulation and the requirement of special activation function implementations.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +3

Paper
Add Code

Towards Budget-Driven Hardware Optimization for Deep Convolutional Neural Networks using Stochastic Computing

no code implementations • 10 May 2018 • Zhe Li, Ji Li, Ao Ren, Caiwen Ding, Jeffrey Draper, Qinru Qiu, Bo Yuan, Yanzhi Wang

Recently, Deep Convolutional Neural Network (DCNN) has achieved tremendous success in many machine learning applications.

Paper
Add Code

Learning Topics using Semantic Locality

no code implementations • 11 Apr 2018 • Ziyi Zhao, Krittaphat Pugdeethosapol, Sheng Lin, Zhe Li, Caiwen Ding, Yanzhi Wang, Qinru Qiu

The topic modeling discovers the latent topic probability of the given text documents.

Topic Models

Paper
Add Code

Structured Weight Matrices-Based Hardware Accelerators in Deep Neural Networks: FPGAs and ASICs

no code implementations • 28 Mar 2018 • Caiwen Ding, Ao Ren, Geng Yuan, Xiaolong Ma, Jiayu Li, Ning Liu, Bo Yuan, Yanzhi Wang

For FPGA implementations on deep convolutional neural networks (DCNNs), we achieve at least 152X and 72X improvement in performance and energy efficiency, respectively using the SWM-based framework, compared with the baseline of IBM TrueNorth processor under same accuracy constraints using the data set of MNIST, SVHN, and CIFAR-10.

Paper
Add Code

Efficient Recurrent Neural Networks using Structured Matrices in FPGAs

no code implementations • 20 Mar 2018 • Zhe Li, Shuo Wang, Caiwen Ding, Qinru Qiu, Yanzhi Wang, Yun Liang

Recurrent Neural Networks (RNNs) are becoming increasingly important for time series-related applications which require efficient and real-time implementations.

Model Compression Time Series +1

Paper
Add Code

C-LSTM: Enabling Efficient LSTM using Structured Compression Techniques on FPGAs

no code implementations • 14 Mar 2018 • Shuo Wang, Zhe Li, Caiwen Ding, Bo Yuan, Yanzhi Wang, Qinru Qiu, Yun Liang

The previous work proposes to use a pruning based compression technique to reduce the model size and thus speedups the inference on FPGAs.

Paper
Add Code

Towards Ultra-High Performance and Energy Efficiency of Deep Learning Systems: An Algorithm-Hardware Co-Optimization Framework

no code implementations • 18 Feb 2018 • Yanzhi Wang, Caiwen Ding, Zhe Li, Geng Yuan, Siyu Liao, Xiaolong Ma, Bo Yuan, Xuehai Qian, Jian Tang, Qinru Qiu, Xue Lin

Hardware accelerations of deep learning systems have been extensively investigated in industry and academia.

Paper
Add Code

VIBNN: Hardware Acceleration of Bayesian Neural Networks

no code implementations • 2 Feb 2018 • Ruizhe Cai, Ao Ren, Ning Liu, Caiwen Ding, Luhao Wang, Xuehai Qian, Massoud Pedram, Yanzhi Wang

In this paper, we propose VIBNN, an FPGA-based hardware accelerator design for variational inference on BNNs.

Small Data Image Classification Variational Inference

Paper
Add Code

FFT-Based Deep Learning Deployment in Embedded Systems

no code implementations • 13 Dec 2017 • Sheng Lin, Ning Liu, Mahdi Nazemi, Hongjia Li, Caiwen Ding, Yanzhi Wang, Massoud Pedram

The large model size of DNNs, while providing excellent accuracy, also burdens the embedded platforms with intensive computation and storage.

speech-recognition Speech Recognition

Paper
Add Code

CirCNN: Accelerating and Compressing Deep Neural Networks Using Block-CirculantWeight Matrices

no code implementations • 29 Aug 2017 • Caiwen Ding, Siyu Liao, Yanzhi Wang, Zhe Li, Ning Liu, Youwei Zhuo, Chao Wang, Xuehai Qian, Yu Bai, Geng Yuan, Xiaolong Ma, Yi-Peng Zhang, Jian Tang, Qinru Qiu, Xue Lin, Bo Yuan

As the size of DNNs continues to grow, it is critical to improve the energy efficiency and performance while maintaining accuracy.

Paper
Add Code

Hardware-Driven Nonlinear Activation for Stochastic Computing Based Deep Convolutional Neural Networks

no code implementations • 12 Mar 2017 • Ji Li, Zihao Yuan, Zhe Li, Caiwen Ding, Ao Ren, Qinru Qiu, Jeffrey Draper, Yanzhi Wang

Recently, Deep Convolutional Neural Networks (DCNNs) have made unprecedented progress, achieving the accuracy close to, or even better than human-level perception in various tasks.

Paper
Add Code

SC-DCNN: Highly-Scalable Deep Convolutional Neural Network using Stochastic Computing

no code implementations • 18 Nov 2016 • Ao Ren, Ji Li, Zhe Li, Caiwen Ding, Xuehai Qian, Qinru Qiu, Bo Yuan, Yanzhi Wang

Stochastic Computing (SC), which uses bit-stream to represent a number within [-1, 1] by counting the number of ones in the bit-stream, has a high potential for implementing DCNNs with high scalability and ultra-low hardware footprint.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.