1 code implementation • 9 Jan 2024 • Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu
Specifically, at the calibration sample level, we select calibration samples based on their density and diversity in the latent space, thus aligning their distribution with that of the overall samples; and at the reconstruction output level, we propose Fine-grained Block Reconstruction, which aligns the outputs of the quantized model and the full-precision model at different network granularities.
no code implementations • 24 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
In this paper, we first argue empirically that the severe performance degradation is mainly caused by weight oscillation during binarization training and information distortion in the activations of ViTs.
no code implementations • 11 May 2023 • Junrui Xiao, Zhikai Li, Lianwei Yang, Qingyi Gu
As emerging hardware begins to support mixed bit-width arithmetic computation, mixed-precision quantization is widely used to reduce the complexity of neural networks.
1 code implementation • ICCV 2023 • Zhikai Li, Junrui Xiao, Lianwei Yang, Qingyi Gu
Post-training quantization (PTQ), which requires only a tiny dataset for calibration and no end-to-end retraining, is a lightweight and practical model compression technique.
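To make the calibration step concrete, here is a minimal sketch of generic min/max PTQ calibration for an 8-bit asymmetric uniform quantizer. It is not the method proposed in this paper; the function names (`calibrate`, `quantize`, `dequantize`) and the synthetic Gaussian calibration data are assumptions made purely for illustration.

```python
import random


def calibrate(samples, num_bits=8):
    """Derive (scale, zero_point) from the min/max of a small calibration set."""
    lo = min(min(s) for s in samples)
    hi = max(max(s) for s in samples)
    # Widen the range to include 0 so that zero is exactly representable.
    lo, hi = min(lo, 0.0), max(hi, 0.0)
    qmax = 2 ** num_bits - 1
    scale = (hi - lo) / qmax if hi > lo else 1.0
    zero_point = round(-lo / scale)
    return scale, zero_point


def quantize(x, scale, zero_point, num_bits=8):
    """Map floats to integers in [0, 2^num_bits - 1], clamping out-of-range values."""
    qmax = 2 ** num_bits - 1
    return [min(max(round(v / scale) + zero_point, 0), qmax) for v in x]


def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float values."""
    return [(v - zero_point) * scale for v in q]


# Hypothetical calibration set: 32 samples of 64 activations each.
random.seed(0)
calib = [[random.gauss(0.0, 1.0) for _ in range(64)] for _ in range(32)]

# No retraining: the quantizer parameters come from calibration statistics only.
scale, zp = calibrate(calib)

x = calib[0]
x_hat = dequantize(quantize(x, scale, zp), scale, zp)
max_err = max(abs(a - b) for a, b in zip(x, x_hat))
```

For values inside the calibrated range, the round-trip error stays at roughly half a quantization step (`scale / 2`), which is why the choice of calibration data matters: the range it implies fixes the step size.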
1 code implementation • 13 Sep 2022 • Zhikai Li, Mengjuan Chen, Junrui Xiao, Qingyi Gu
In this paper, we propose PSAQ-ViT V2, a more accurate and general data-free quantization framework for ViTs, built on top of PSAQ-ViT.
1 code implementation • 11 Apr 2022 • Jiayu Zou, Junrui Xiao, Zheng Zhu, JunJie Huang, Guan Huang, Dalong Du, Xingang Wang
To reap the benefits and avoid the drawbacks of Camera model-Based Feature Transformation (CBFT) and Camera model-Free Feature Transformation (CFFT), we propose a novel framework with a Hybrid Feature Transformation module (HFT).
1 code implementation • 4 Mar 2022 • Zhikai Li, Liping Ma, Mengjuan Chen, Junrui Xiao, Qingyi Gu
The above insights guide us to design a relative value metric to optimize the Gaussian noise to approximate the real images, which are then utilized to calibrate the quantization parameters.