Search Results for author: Zhikai Li

Found 13 papers, 7 papers with code

ML2SC: Deploying Machine Learning Models as Smart Contracts on the Blockchain

no code implementations • 28 Mar 2024 • Zhikai Li, Steve Vott, Bhaskar Krishnamachari

Finally, model inference can also be performed via a function call that provides the input.
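
As a rough illustration of what such a call looks like from a client, here is a hedged web3.py sketch; the contract address, ABI, and the `classify` function name are hypothetical stand-ins, not ML2SC's actual interface.

```python
from web3 import Web3

w3 = Web3(Web3.HTTPProvider("http://localhost:8545"))  # local node

# Hypothetical ABI for a view function that runs on-chain inference.
abi = [{
    "name": "classify",
    "type": "function",
    "stateMutability": "view",
    "inputs": [{"name": "x", "type": "int256[]"}],
    "outputs": [{"name": "label", "type": "uint256"}],
}]
model = w3.eth.contract(
    address="0x0000000000000000000000000000000000000000",  # placeholder address
    abi=abi,
)

# Inference is a read-only call: pass fixed-point inputs, get a label back.
features = [1024, -512, 2048]  # inputs pre-scaled to integers
label = model.functions.classify(features).call()
print("predicted class:", label)
```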

LLM Inference Unveiled: Survey and Roofline Model Insights

2 code implementations • 26 Feb 2024 • Zhihang Yuan, Yuzhang Shang, Yang Zhou, Zhen Dong, Zhe Zhou, Chenhao Xue, Bingzhe Wu, Zhikai Li, Qingyi Gu, Yong Jae Lee, Yan Yan, Beidi Chen, Guangyu Sun, Kurt Keutzer

Our survey stands out from traditional literature reviews by not only summarizing the current state of research but also by introducing a framework based on the roofline model for the systematic analysis of LLM inference techniques.

Knowledge Distillation · Language Modelling +3
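
For readers unfamiliar with the roofline model the survey builds on, a minimal sketch follows; the hardware numbers are illustrative (roughly A100-class) and not taken from the paper.

```python
# Roofline model: a kernel's attainable performance is capped either by
# peak compute or by memory bandwidth times its arithmetic intensity.
PEAK_FLOPS = 312e12   # peak FP16 throughput, FLOP/s (illustrative)
PEAK_BW = 2.0e12      # peak memory bandwidth, B/s (illustrative)

def attainable_flops(flops, bytes_moved):
    intensity = flops / bytes_moved            # FLOP per byte moved
    return min(PEAK_FLOPS, intensity * PEAK_BW)

# Decode-stage GEMV (batch 1): 2*M*N FLOPs, but all M*N weights must be read.
M, N = 4096, 4096
flops = 2 * M * N
bytes_moved = M * N * 2                        # FP16 weights
ratio = attainable_flops(flops, bytes_moved) / PEAK_FLOPS
print(f"fraction of peak: {ratio:.4f}")        # tiny -> memory-bound
```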

RepQuant: Towards Accurate Post-Training Quantization of Large Transformer Models via Scale Reparameterization

no code implementations • 8 Feb 2024 • Zhikai Li, Xuewen Liu, Jing Zhang, Qingyi Gu

In particular, for the former, we introduce a learnable per-channel dual clipping scheme, which is designed to efficiently identify outliers in the unbalanced activations with fine granularity.

Quantization
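
A minimal sketch of the per-channel dual clipping idea, assuming learnable lower and upper bounds per channel trained with the task loss; this illustrates the concept, not RepQuant's exact module.

```python
import torch
import torch.nn as nn

class DualClip(nn.Module):
    """Each channel learns its own lower and upper clipping bound, so
    outliers in unbalanced activations are trimmed at fine granularity."""
    def __init__(self, num_channels):
        super().__init__()
        self.lo = nn.Parameter(torch.full((num_channels,), -4.0))
        self.hi = nn.Parameter(torch.full((num_channels,), 4.0))

    def forward(self, x):  # x: (..., num_channels)
        # clamp each channel into its learned [lo, hi] range
        return torch.minimum(torch.maximum(x, self.lo), self.hi)

clip = DualClip(768)
act = torch.randn(16, 197, 768) * 10   # activations with heavy outliers
out = clip(act)                        # bounds are trained with the task loss
```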

An Improved Grey Wolf Optimization Algorithm for Heart Disease Prediction

no code implementations • 22 Jan 2024 • Sihan Niu, Yifan Zhou, Zhikai Li, Shuyao Huang, Yujun Zhou

This paper presents a unique solution to challenges in medical image processing by incorporating an adaptive curve grey wolf optimization (ACGWO) algorithm into neural network backpropagation.

Disease Prediction
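
For context, here is a minimal sketch of the standard grey wolf position update that ACGWO builds on; the paper's adaptive-curve coefficient schedule is not shown in this excerpt, so the linear decay below is the vanilla variant.

```python
import numpy as np

def gwo_step(wolves, fitness, t, T):
    """One vanilla GWO iteration: move each wolf toward the three best
    (alpha, beta, delta), with exploration controlled by coefficient a."""
    order = np.argsort([fitness(w) for w in wolves])
    alpha, beta, delta = wolves[order[0]], wolves[order[1]], wolves[order[2]]
    a = 2 * (1 - t / T)                       # linear decay from 2 to 0
    new = []
    for w in wolves:
        x = np.zeros_like(w)
        for leader in (alpha, beta, delta):
            r1, r2 = np.random.rand(*w.shape), np.random.rand(*w.shape)
            A, C = 2 * a * r1 - a, 2 * r2
            x += leader - A * np.abs(C * leader - w)   # pull toward leader
        new.append(x / 3)                     # average of the three pulls
    return np.array(new)

wolves = np.random.uniform(-5, 5, (20, 10))  # 20 wolves, 10 dimensions
sphere = lambda w: float(np.sum(w ** 2))     # toy objective
for t in range(100):
    wolves = gwo_step(wolves, sphere, t, 100)
```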

RTA-Former: Reverse Transformer Attention for Polyp Segmentation

1 code implementation • 22 Jan 2024 • Zhikai Li, Murong Yi, Ali Uneri, Sihan Niu, Craig Jones

Polyp segmentation is a key aspect of colorectal cancer prevention, enabling early detection and guiding subsequent treatments.

Segmentation

Enhanced Distribution Alignment for Post-Training Quantization of Diffusion Models

1 code implementation • 9 Jan 2024 • Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu

Specifically, at the calibration sample level, we select calibration samples based on their density and diversity in the latent space, facilitating alignment of their distribution with that of the overall samples. At the reconstruction output level, we propose Fine-grained Block Reconstruction, which aligns the outputs of the quantized model and the full-precision model at different network granularities.

Denoising · Image Generation +2
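
A hedged sketch of block-wise reconstruction as a generic PTQ step, assuming paired full-precision and quantized blocks plus a small calibration set; the block partitioning and objective here are illustrative, not the paper's exact Fine-grained Block Reconstruction.

```python
import torch
import torch.nn.functional as F

def reconstruct_block(fp_block, q_block, calib_inputs, steps=500, lr=1e-3):
    """Tune the quantized block so its output matches the full-precision
    block's output on calibration samples (generic block reconstruction)."""
    params = [p for p in q_block.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(steps):
        idx = torch.randint(len(calib_inputs), (8,))  # small random batch
        x = calib_inputs[idx]
        with torch.no_grad():
            target = fp_block(x)                      # FP reference output
        loss = F.mse_loss(q_block(x), target)
        opt.zero_grad(); loss.backward(); opt.step()
```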

QFT: Quantized Full-parameter Tuning of LLMs with Affordable Resources

no code implementations • 11 Oct 2023 • Zhikai Li, Xiaoxuan Liu, Banghua Zhu, Zhen Dong, Qingyi Gu, Kurt Keutzer

Large Language Models (LLMs) have demonstrated remarkable impact across a wide spectrum of natural language processing tasks.

Quantization

BinaryViT: Towards Efficient and Accurate Binary Vision Transformers

no code implementations • 24 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu

In this paper, we first show empirically that the severe performance degradation is mainly caused by weight oscillation during binarization training and by information distortion in the activations of ViTs.

Binarization · Quantization
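
To see why weight oscillation arises, here is a small illustration using a straight-through estimator, a common binarization setup (assumed here, not necessarily the paper's exact training recipe): latent weights near zero flip their binarized sign from step to step.

```python
import torch

class BinarizeSTE(torch.autograd.Function):
    @staticmethod
    def forward(ctx, w):
        return torch.sign(w)     # forward: +1 / -1 weights
    @staticmethod
    def backward(ctx, grad_out):
        return grad_out          # backward: pass gradient straight through

w = (torch.randn(5) * 0.01).requires_grad_()   # latent weights near zero
opt = torch.optim.SGD([w], lr=0.1)
for step in range(4):
    loss = (BinarizeSTE.apply(w) * torch.randn(5)).sum()
    opt.zero_grad(); loss.backward(); opt.step()
    print(torch.sign(w.detach()))              # signs oscillate step to step
```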

Patch-wise Mixed-Precision Quantization of Vision Transformer

no code implementations • 11 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu

As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks.

Quantization
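
A toy sketch of the underlying idea, assuming a symmetric uniform quantizer and a hypothetical 8-bit/4-bit assignment: sensitive patches keep more bits while insensitive ones use fewer.

```python
import torch

def quantize(x, bits):
    """Symmetric uniform quantization to a given bit-width."""
    qmax = 2 ** (bits - 1) - 1
    scale = x.abs().max() / qmax               # per-tensor scale
    return torch.round(x / scale).clamp(-qmax - 1, qmax) * scale

patch_a = torch.randn(64)                      # "sensitive" patch -> 8 bits
patch_b = torch.randn(64)                      # "insensitive" patch -> 4 bits
out = torch.cat([quantize(patch_a, 8), quantize(patch_b, 4)])
```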

RepQ-ViT: Scale Reparameterization for Post-Training Quantization of Vision Transformers

1 code implementation • ICCV 2023 • Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu

Post-training quantization (PTQ), which only requires a tiny dataset for calibration without end-to-end retraining, is a light and practical model compression technique.

Model Compression · Quantization
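
To make the PTQ setting concrete, a generic calibration sketch follows: record activation ranges on a handful of batches and derive a quantization scale, with no retraining. This shows vanilla calibration, not RepQ-ViT's scale-reparameterization procedure itself.

```python
import torch

@torch.no_grad()
def calibrate_scale(model, layer, calib_loader, bits=8):
    """Observe one layer's output range on a tiny calibration set and
    return a symmetric quantization scale for it."""
    amax = torch.tensor(0.0)
    def hook(_module, _inputs, out):
        nonlocal amax
        amax = torch.maximum(amax, out.abs().max())
    h = layer.register_forward_hook(hook)
    for x, _ in calib_loader:                  # a handful of batches suffices
        model(x)
    h.remove()
    return amax / (2 ** (bits - 1) - 1)        # symmetric scale
```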

PSAQ-ViT V2: Towards Accurate and General Data-Free Quantization for Vision Transformers

1 code implementation • 13 Sep 2022 • Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu

In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT.

Data Free Quantization · Image Classification +4

I-ViT: Integer-only Quantization for Efficient Vision Transformer Inference

1 code implementation • ICCV 2023 • Zhikai Li, Qingyi Gu

In this paper, we propose I-ViT, an integer-only quantization scheme for ViTs, to enable ViTs to perform the entire computational graph of inference with integer arithmetic and bit-shifting, and without any floating-point arithmetic.

Quantization
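
The standard trick behind integer-only inference is dyadic rescaling: a floating-point multiplier is replaced by an integer multiply plus a right bit-shift. A minimal sketch of that idea (generic, not I-ViT's exact kernels):

```python
def dyadic_approx(scale, shift_bits=16):
    """Find integer m such that scale ~= m / 2**shift_bits."""
    return round(scale * (1 << shift_bits))

scale = 0.0173                    # e.g. s_x * s_w / s_out
m = dyadic_approx(scale)
acc = 12345                       # int32 accumulator from an integer GEMM
requant = (acc * m) >> 16         # integer multiply + bit-shift, no floats
print(requant, round(acc * scale))  # 213 vs 214: close agreement
```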

Patch Similarity Aware Data-Free Quantization for Vision Transformers

1 code implementation • 4 Mar 2022 • Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu

The above insights guide us to design a relative value metric to optimize the Gaussian noise to approximate the real images, which are then utilized to calibrate the quantization parameters.

Data Free Quantization
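
A hedged sketch of the data-free calibration loop, assuming the model exposes patch-token features; `patch_similarity_score` below is a hypothetical stand-in for the paper's relative value metric.

```python
import torch

def patch_similarity_score(model, x):
    """Stand-in objective: spread of pairwise patch-token similarities.
    Illustrative only, not the paper's actual metric."""
    feats = model(x)                           # assumed shape (B, tokens, dim)
    sim = torch.cosine_similarity(
        feats.unsqueeze(1), feats.unsqueeze(2), dim=-1)
    return sim.std()

def synthesize(model, steps=200, lr=0.05):
    """Optimize Gaussian noise toward 'real-looking' inputs, then use the
    result to calibrate quantization parameters."""
    x = torch.randn(8, 3, 224, 224, requires_grad=True)  # Gaussian init
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        loss = -patch_similarity_score(model, x)          # maximize the metric
        opt.zero_grad(); loss.backward(); opt.step()
    return x.detach()                                     # feed to PTQ calibration
```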
