Search Results for author: Jingwen Leng

Found 21 papers, 6 papers with code

Accelerating Generic Graph Neural Networks via Architecture, Compiler, Partition Method Co-Design

no code implementations16 Aug 2023 Shuwen Lu, Zhihui Zhang, Cong Guo, Jingwen Leng, Yangjie Zhou, Minyi Guo

However, designing GNN accelerators faces two fundamental challenges: the high bandwidth requirement and the diversity of GNN models.

Graph Learning graph partitioning
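
The bandwidth pressure comes from the neighbor-gather step of message passing. The toy mean-aggregation layer below (a generic NumPy sketch, not the paper's accelerator, compiler, or partition method; `gnn_aggregate` and its arguments are illustrative names) shows the irregular per-node gathers that drive it.

```python
import numpy as np

def gnn_aggregate(features, neighbor_lists):
    """One mean-aggregation step of a generic message-passing GNN.

    features:       (num_nodes, feat_dim) node feature matrix
    neighbor_lists: list of index arrays, the neighbors of each node
    The per-node gather over irregular neighbor sets is what drives the high
    memory-bandwidth demand; the accelerator itself is not modeled here.
    """
    out = np.zeros_like(features)
    for v, nbrs in enumerate(neighbor_lists):
        if len(nbrs):
            out[v] = features[nbrs].mean(axis=0)   # irregular gather of neighbor rows
    return out

feats = np.random.randn(5, 8)
neighbors = [np.array([1, 2]), np.array([0]), np.array([0, 3, 4]),
             np.array([2]), np.array([2])]
print(gnn_aggregate(feats, neighbors).shape)       # (5, 8)
```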

AdaptGear: Accelerating GNN Training via Adaptive Subgraph-Level Kernels on GPUs

no code implementations27 May 2023 Yangjie Zhou, Yaoxu Song, Jingwen Leng, Zihan Liu, Weihao Cui, Zhendong Zhang, Cong Guo, Quan Chen, Li Li, Minyi Guo

Graph neural networks (GNNs) are powerful tools for exploring and learning from graph structures and features.

Fourier Transformer: Fast Long Range Modeling by Removing Sequence Redundancy with FFT Operator

1 code implementation24 May 2023 Ziwei He, Meng Yang, Minwei Feng, Jingcheng Yin, Xinbing Wang, Jingwen Leng, Zhouhan Lin

Many researchers have focused on designing new forms of self-attention or introducing new parameters to overcome this limitation; however, a large portion of these approaches prevents the model from inheriting weights from large pretrained models.

Abstractive Text Summarization Document Summarization +2
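
One parameter-free way to shorten a sequence is to keep only its low-frequency components along the sequence axis. The sketch below illustrates that general idea with NumPy's FFT; it is not the Fourier Transformer's actual operator, and the function name and keep_ratio parameter are illustrative.

```python
import numpy as np

def fft_downsample(hidden_states: np.ndarray, keep_ratio: float = 0.5) -> np.ndarray:
    """Shorten a (seq_len, d_model) sequence by keeping only its
    lowest-frequency components along the sequence axis.

    Illustrative only: the paper's operator may differ in normalization
    and in how components are selected.
    """
    spectrum = np.fft.rfft(hidden_states, axis=0)       # (seq_len//2 + 1, d_model)
    kept = int(np.ceil(spectrum.shape[0] * keep_ratio))
    truncated = spectrum[:kept]                          # drop high-frequency bins
    new_len = max(2 * (kept - 1), 1)                     # sequence length implied by kept bins
    return np.fft.irfft(truncated, n=new_len, axis=0)    # shorter sequence, no new parameters

x = np.random.randn(128, 64)
print(fft_downsample(x, keep_ratio=0.25).shape)          # (32, 64): a 4x shorter sequence
```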

Nesting Forward Automatic Differentiation for Memory-Efficient Deep Neural Network Training

no code implementations22 Sep 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Chen Zhang, Ying Cao, Quanlu Zhang, Yunxin Liu, Fan Yang, Minyi Guo

An activation function is an element-wise mathematical function that plays a crucial role in deep neural networks (DNNs).
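
Because an activation is element-wise, forward-mode automatic differentiation can pair its value with its derivative in a single pass. The dual-number sketch below illustrates plain forward-mode AD on a GELU approximation; it is only a toy and does not reproduce the paper's nesting scheme for memory-efficient training.

```python
import math
import numpy as np

class Dual:
    """Minimal dual number for forward-mode AD over NumPy arrays."""
    def __init__(self, value, tangent):
        self.value, self.tangent = value, tangent
    def __add__(self, other):
        o = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.value + o.value, self.tangent + o.tangent)
    __radd__ = __add__
    def __mul__(self, other):
        o = other if isinstance(other, Dual) else Dual(other, 0.0)
        return Dual(self.value * o.value,
                    self.tangent * o.value + self.value * o.tangent)
    __rmul__ = __mul__

def tanh(d: Dual) -> Dual:
    t = np.tanh(d.value)
    return Dual(t, (1.0 - t**2) * d.tangent)

def gelu(d: Dual) -> Dual:
    """Element-wise GELU (tanh approximation) written over dual numbers."""
    c = math.sqrt(2.0 / math.pi)
    inner = c * (d + 0.044715 * (d * d * d))
    return 0.5 * (d * (1.0 + tanh(inner)))

x = np.linspace(-3.0, 3.0, 5)
out = gelu(Dual(x, np.ones_like(x)))   # seed tangent = 1, so tangent carries f'(x)
print(out.value)    # GELU(x)
print(out.tangent)  # GELU'(x), obtained in the same forward pass
```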

ANT: Exploiting Adaptive Numerical Data Type for Low-bit Deep Neural Network Quantization

1 code implementation30 Aug 2022 Cong Guo, Chen Zhang, Jingwen Leng, Zihan Liu, Fan Yang, Yunxin Liu, Minyi Guo, Yuhao Zhu

In this work, we propose a fixed-length adaptive numerical data type called ANT to achieve low-bit quantization with tiny hardware overheads.

Quantization
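
The idea of adapting a numerical data type can be illustrated by letting each tensor choose, after the fact, between two low-bit encodings. The sketch below compares a uniform integer grid with a power-of-two grid and keeps whichever reconstructs the tensor better; ANT's actual data type and selection rule differ, and all names here are illustrative.

```python
import numpy as np

def quantize_uniform(x, bits=4):
    """Symmetric uniform integer quantization (illustrative)."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(x)) / qmax + 1e-12
    return np.round(x / scale).clip(-qmax, qmax) * scale

def quantize_pow2(x, bits=4):
    """Power-of-two (log-scale) quantization, better suited to long-tailed tensors."""
    sign = np.sign(x)
    exp = np.round(np.log2(np.abs(x) + 1e-12))
    exp = np.clip(exp, exp.max() - (2 ** (bits - 1) - 1), exp.max())
    return sign * 2.0 ** exp

def adaptive_quantize(x, bits=4):
    """Pick, per tensor, whichever low-bit encoding reconstructs it better.

    Only illustrates adapting the data type to the tensor's distribution;
    ANT's actual format and hardware decoder are not modeled here.
    """
    candidates = {"uniform-int": quantize_uniform(x, bits),
                  "power-of-two": quantize_pow2(x, bits)}
    name, xq = min(candidates.items(), key=lambda kv: np.mean((x - kv[1]) ** 2))
    return name, xq

gaussian_like = np.random.randn(1024)
long_tailed = np.random.laplace(scale=0.1, size=1024)
print(adaptive_quantize(gaussian_like)[0], adaptive_quantize(long_tailed)[0])
```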

SALO: An Efficient Spatial Accelerator Enabling Hybrid Sparse Attention Mechanisms for Long Sequences

no code implementations29 Jun 2022 Guan Shen, Jieru Zhao, Quan Chen, Jingwen Leng, Chao Li, Minyi Guo

However, the quadratic complexity of self-attention w.r.t. the sequence length incurs heavy computational and memory burdens, especially for tasks with long sequences.
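
A quick way to see the quadratic cost, and how window-style sparse attention relieves it, is to count the multiply-accumulates in the attention-score computation. The window of 256 tokens below is an arbitrary illustrative choice, not a setting from the paper.

```python
def full_attention_ops(seq_len, d_model):
    # the score matrix QK^T is seq_len x seq_len: compute and memory grow quadratically
    return seq_len * seq_len * d_model

def sliding_window_ops(seq_len, d_model, window=256):
    # each query attends only to `window` neighbors: cost grows linearly in seq_len
    return seq_len * window * d_model

for n in (1024, 4096, 16384):
    print(n, full_attention_ops(n, 64) / sliding_window_ops(n, 64))  # 4x, 16x, 64x savings
```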

Transkimmer: Transformer Learns to Layer-wise Skim

1 code implementation ACL 2022 Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo

To address the above limitations, we propose the Transkimmer architecture, which learns to identify hidden state tokens that are not required by each layer.

Computational Efficiency
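
The gather-and-shrink dataflow of layer-wise skimming can be sketched with a toy linear predictor that scores each token and drops the low-scoring ones before the layer runs. The real Transkimmer uses a learned, differentiable skim module; the code below only illustrates the mechanism, and all names are illustrative.

```python
import numpy as np

def skim_layer(hidden, predictor_w, threshold=0.5):
    """Drop tokens a layer does not need (toy sketch of layer-wise skimming).

    hidden:      (num_tokens, d_model) hidden states entering the layer
    predictor_w: (d_model,) weights of a toy linear skim predictor
    Returns the kept hidden states and the indices that survived.
    """
    keep_prob = 1.0 / (1.0 + np.exp(-(hidden @ predictor_w)))  # sigmoid score per token
    keep_idx = np.nonzero(keep_prob >= threshold)[0]
    return hidden[keep_idx], keep_idx

hidden = np.random.randn(128, 64)
w = np.random.randn(64)
kept, idx = skim_layer(hidden, w)
print(hidden.shape, "->", kept.shape)   # subsequent layers process fewer tokens
```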

SQuant: On-the-Fly Data-Free Quantization via Diagonal Hessian Approximation

1 code implementation ICLR 2022 Cong Guo, Yuxian Qiu, Jingwen Leng, Xiaotian Gao, Chen Zhang, Yunxin Liu, Fan Yang, Yuhao Zhu, Minyi Guo

This paper proposes an on-the-fly DFQ framework with sub-second quantization time, called SQuant, which can quantize networks on inference-only devices with low computation and memory requirements.

Data Free Quantization
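
Data-free, Hessian-aware rounding can be loosely illustrated by starting from round-to-nearest and then flipping a few roundings so the kernel's accumulated error cancels. The sketch below is only in that spirit; SQuant's diagonal Hessian approximation and its actual flipping algorithm differ, and the function name is illustrative.

```python
import numpy as np

def flip_aware_round(w, scale):
    """Toy data-free rounding: begin with round-to-nearest, then flip the
    roundings whose residuals best cancel the kernel's accumulated error."""
    q = np.round(w / scale)
    residual = w / scale - q                      # per-element rounding residual
    accumulated = residual.sum()                  # error that adds up across the kernel
    flips = int(np.rint(abs(accumulated)))        # how many elements to re-round
    if flips:
        direction = np.sign(accumulated)
        order = np.argsort(-direction * residual)[:flips]  # residuals most worth flipping
        q[order] += direction
    return q * scale

w = np.random.randn(64) * 0.1
scale = 0.02
wq = flip_aware_round(w, scale)
print("accumulated error:",
      abs((w - np.round(w / scale) * scale).sum()), "->", abs((w - wq).sum()))
```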

Block-Skim: Efficient Question Answering for Transformer

1 code implementation16 Dec 2021 Yue Guan, Zhengyi Li, Jingwen Leng, Zhouhan Lin, Minyi Guo, Yuhao Zhu

We further prune the hidden states corresponding to the unnecessary positions early in lower layers, achieving significant inference-time speedup.

Extractive Question-Answering Question Answering

Dubhe: Towards Data Unbiasedness with Homomorphic Encryption in Federated Learning Client Selection

no code implementations8 Sep 2021 Shulai Zhang, Zirui Li, Quan Chen, Wenli Zheng, Jingwen Leng, Minyi Guo

Federated learning (FL) is a distributed machine learning paradigm that allows clients to collaboratively train a model over their own local data.

Federated Learning
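
The collaborative-training step described in the snippet reduces, in its simplest form, to a weighted average of client updates (plain FedAvg). The sketch below shows only that step; Dubhe's homomorphically encrypted, unbiasedness-aware client selection is not modeled, and the names are illustrative.

```python
import numpy as np

def fedavg(client_updates, client_sizes):
    """Weighted average of client model parameters (plain FedAvg)."""
    total = sum(client_sizes)
    return sum(w * (n / total) for w, n in zip(client_updates, client_sizes))

updates = [np.random.randn(10) for _ in range(3)]   # toy "model weights" per client
sizes = [120, 40, 200]                              # local dataset sizes
print(fedavg(updates, sizes))
```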

Dual-side Sparse Tensor Core

no code implementations20 May 2021 Yang Wang, Chen Zhang, Zhiqiang Xie, Cong Guo, Yunxin Liu, Jingwen Leng

We demonstrate the feasibility of our design with minimal changes to the existing production-scale inner-product-based Tensor Core.
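
Dual-side sparsity means skipping work wherever either operand, weight or activation, is zero. The merge-style inner product below is a software analogy of that intersection; the paper's contribution is performing it efficiently inside an inner-product-based Tensor Core, which this sketch does not model.

```python
import numpy as np

def dual_side_sparse_dot(a_idx, a_val, b_idx, b_val):
    """Inner product that only touches positions where BOTH operands are nonzero.

    a_idx/b_idx are sorted nonzero indices; a_val/b_val are the matching values.
    """
    result, i, j = 0.0, 0, 0
    while i < len(a_idx) and j < len(b_idx):        # merge the two sorted index lists
        if a_idx[i] == b_idx[j]:
            result += a_val[i] * b_val[j]
            i += 1; j += 1
        elif a_idx[i] < b_idx[j]:
            i += 1
        else:
            j += 1
    return result

# dense vectors with roughly 75% zeros, stored as (indices, values)
a = np.random.randn(64) * (np.random.rand(64) > 0.75)
b = np.random.randn(64) * (np.random.rand(64) > 0.75)
a_idx, b_idx = np.nonzero(a)[0], np.nonzero(b)[0]
print(dual_side_sparse_dot(a_idx, a[a_idx], b_idx, b[b_idx]))  # matches the dense dot product
print(float(a @ b))
```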

Block Skim Transformer for Efficient Question Answering

no code implementations1 Jan 2021 Yue Guan, Jingwen Leng, Yuhao Zhu, Minyi Guo

Following this idea, we propose the Block Skim Transformer (BST) to improve and accelerate the processing of transformer QA models.

Language Modelling Model Compression +1

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention

no code implementations COLING 2020 Yue Guan, Jingwen Leng, Chao Li, Quan Chen, Minyi Guo

Recent research on the multi-head attention mechanism, especially that in pre-trained models such as BERT, has shown us heuristics and clues in analyzing various aspects of the mechanism.

Clustering
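
One plausible way to quantify how far an attention head looks is the expected token offset under its attention distribution. The sketch below computes that per-head statistic for a toy attention matrix; the paper's exact distance definition and clustering procedure may differ.

```python
import numpy as np

def mean_attention_distance(attn):
    """Average token distance an attention head attends over.

    attn: (seq_len, seq_len) attention weights for one head (rows sum to 1).
    """
    seq_len = attn.shape[0]
    offsets = np.abs(np.arange(seq_len)[:, None] - np.arange(seq_len)[None, :])
    return float((attn * offsets).sum(axis=1).mean())

# toy head: row-wise softmax of random scores over a short sequence
scores = np.random.randn(16, 16)
attn = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
print(mean_attention_distance(attn))
```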

How Far Does BERT Look At: Distance-based Clustering and Analysis of BERT's Attention

no code implementations2 Nov 2020 Yue Guan, Jingwen Leng, Chao Li, Quan Chen, Minyi Guo

Recent research on the multi-head attention mechanism, especially that in pre-trained models such as BERT, has shown us heuristics and clues in analyzing various aspects of the mechanism.

Clustering

Architectural Implications of Graph Neural Networks

no code implementations2 Sep 2020 Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Graph neural networks (GNNs) represent an emerging line of deep learning models that operate on graph structures.

Balancing Efficiency and Flexibility for DNN Acceleration via Temporal GPU-Systolic Array Integration

no code implementations18 Feb 2020 Cong Guo, Yangjie Zhou, Jingwen Leng, Yuhao Zhu, Zidong Du, Quan Chen, Chao Li, Bin Yao, Minyi Guo

We propose Simultaneous Multi-mode Architecture (SMA), a novel architecture design and execution model that offers general-purpose programmability on DNN accelerators in order to accelerate end-to-end applications.

Adversarial Defense Through Network Profiling Based Path Extraction

no code implementations CVPR 2019 Yuxian Qiu, Jingwen Leng, Cong Guo, Quan Chen, Chao Li, Minyi Guo, Yuhao Zhu

Recently, researchers have started decomposing deep neural network models according to their semantics or functions.

Adversarial Defense

Effective Path: Know the Unknowns of Neural Network

no code implementations27 Sep 2018 Yuxian Qiu, Jingwen Leng, Yuhao Zhu, Quan Chen, Chao Li, Minyi Guo

Despite their enormous success, there is still no solid understanding of the working mechanism of deep neural networks.
