Search Results for author: Qun Gao

Found 2 papers, 2 papers with code

Efficient Post-training Quantization with FP8 Formats

2 code implementations · 26 Sep 2023 · Haihao Shen, Naveen Mellempudi, Xin He, Qun Gao, Chang Wang, Mengni Wang

Recent advances in deep learning methods such as LLMs and Diffusion models have created a need for improved quantization methods that can meet the computational demands of these modern architectures while maintaining accuracy.

Image Classification, Language Modelling (+3 more)

An Efficient Sparse Inference Software Accelerator for Transformer-based Language Models on CPUs

1 code implementation · 28 Jun 2023 · Haihao Shen, Hengyu Meng, Bo Dong, Zhe Wang, Ofir Zafrir, Yi Ding, Yu Luo, Hanwen Chang, Qun Gao, Ziheng Wang, Guy Boudoukh, Moshe Wasserblat

We apply our sparse accelerator on widely-used Transformer-based language models including Bert-Mini, DistilBERT, Bert-Base, and BERT-Large.

Model Compression
