no code implementations • 8 Feb 2024 • Zhikai Li, Xuewen Liu, Jing Zhang, Qingyi Gu
In particular, for the former, we introduce a learnable per-channel dual clipping scheme, which is designed to efficiently identify outliers in the unbalanced activations with fine granularity.
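As a rough illustration of what per-channel dual clipping could look like, the sketch below fake-quantizes a small activation matrix using a separate lower and upper clipping bound for each channel. It is a minimal sketch under stated assumptions, not the authors' implementation: the function name `dual_clip_quantize`, the uniform quantizer, and the fixed bounds `lo` and `hi` are all hypothetical. In the excerpt the bounds are learnable, whereas here they are passed in as plain constants.

```python
from typing import List


def dual_clip_quantize(
    x: List[List[float]],  # activations laid out as x[token][channel]
    lo: List[float],       # hypothetical per-channel lower clipping bounds
    hi: List[float],       # hypothetical per-channel upper clipping bounds
    bits: int = 4,
) -> List[List[float]]:
    """Clip each channel to its own [lo, hi] range, then quantize uniformly.

    Assumption: a plain uniform quantizer with fixed bounds. A learnable
    scheme would treat lo and hi as trainable parameters instead.
    """
    levels = 2 ** bits - 1  # number of quantization steps per channel
    out = []
    for row in x:
        q_row = []
        for value, low, high in zip(row, lo, hi):
            scale = (high - low) / levels          # per-channel step size
            clipped = min(max(value, low), high)   # clip on both sides
            q = round((clipped - low) / scale)     # integer code in [0, levels]
            q_row.append(q * scale + low)          # map back to a real value
        out.append(q_row)
    return out


if __name__ == "__main__":
    acts = [[0.0, 10.0], [1.0, -5.0]]
    # Channel 1 holds the outliers 10.0 and -5.0, which are clipped to [-1, 1].
    print(dual_clip_quantize(acts, lo=[0.0, -1.0], hi=[1.0, 1.0], bits=2))
```

Keeping two independent bounds per channel lets an asymmetric channel be clipped tightly on one side without wasting quantization levels on the other.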
1 code implementation • 9 Jan 2024 • Xuewen Liu, Zhikai Li, Junrui Xiao, Qingyi Gu
Specifically, at the calibration sample level, we select calibration samples based on their density and diversity in the latent space, thus facilitating the alignment of their distribution with that of the overall samples; and at the reconstruction output level, we propose Fine-grained Block Reconstruction, which can align the outputs of the quantized model and the full-precision model at different network granularities.
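One way to picture selection by density and diversity is a greedy procedure over latent vectors, sketched below. The excerpt does not give the scoring rule, so everything here is an assumption made for illustration: the function name `select_calibration`, the density estimate (inverse mean distance to the other samples), the diversity term (distance to the nearest already-selected sample), and their combination by multiplication.

```python
import math
from typing import List


def select_calibration(latents: List[List[float]], k: int) -> List[int]:
    """Greedily pick up to k sample indices that are both dense and diverse.

    Assumptions (not taken from the paper):
      - density   = inverse of the mean Euclidean distance to other samples
      - diversity = distance to the closest already-selected sample
      - score     = density * diversity
    """
    n = len(latents)
    if n == 0 or k <= 0:
        return []
    # Pairwise Euclidean distances in the latent space.
    dist = [[math.dist(a, b) for b in latents] for a in latents]
    density = [1.0 / (1e-8 + sum(row) / max(n - 1, 1)) for row in dist]

    # Start from the sample in the densest region.
    selected = [max(range(n), key=lambda i: density[i])]
    while len(selected) < min(k, n):
        candidates = [i for i in range(n) if i not in selected]

        def score(i: int) -> float:
            diversity = min(dist[i][j] for j in selected)
            return density[i] * diversity

        selected.append(max(candidates, key=score))
    return selected


if __name__ == "__main__":
    points = [[0.0], [1.0], [2.0], [3.0], [10.0]]
    print(select_calibration(points, k=2))
```

The diversity term keeps the selection from collapsing onto a single dense cluster, which is the intuition behind matching the calibration set to the overall sample distribution.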