no code implementations • 14 Feb 2024 • Junhan Kim, Kyungphil Park, Chungman Lee, Ho-young Kim, Joonyoung Kim, Yongkweon Jeon
Through complexity analysis and extensive experiments on various language models, we demonstrate that aespa is both accurate and efficient in quantizing Transformer models.
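For context, a minimal per-channel, round-to-nearest post-training quantization baseline for a Transformer linear layer might look like the sketch below. This is a generic NumPy illustration, not the aespa algorithm itself; the function name and the 4-bit default are assumptions made for the example.

```python
import numpy as np

def quantize_per_channel(weight: np.ndarray, n_bits: int = 4):
    """Round-to-nearest uniform PTQ with one scale per output channel.

    `weight` is assumed to have shape (out_features, in_features), as in a
    Transformer linear layer. Generic baseline for illustration only.
    """
    qmax = 2 ** (n_bits - 1) - 1                      # symmetric signed range, e.g. 7 for 4 bits
    scale = np.abs(weight).max(axis=1, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)          # guard all-zero channels
    q = np.clip(np.round(weight / scale), -qmax - 1, qmax)
    return q.astype(np.int8), scale                   # integer codes + dequantization scales

# Usage: w_hat = q * scale reconstructs an approximation of the original weights.
w = np.random.randn(8, 16).astype(np.float32)
q, scale = quantize_per_channel(w, n_bits=4)
w_hat = q * scale
```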
1 code implementation • CVPR 2023 • Yongkweon Jeon, Chungman Lee, Ho-young Kim
We also propose a post-training quantization algorithm to enhance the performance of quantized models.
no code implementations • CVPR 2022 • Yongkweon Jeon, Chungman Lee, Eulrang Cho, Yeonju Ro
We thus propose Mr. BiQ, a new post-training non-uniform quantization method that enables low bit-width quantization even for Transformer models.
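To illustrate what non-uniform quantization means in this setting, the sketch below fits a small codebook of freely placed levels to a weight tensor using Lloyd (k-means style) iterations. It is a generic illustration of non-uniform quantization only, not the Mr. BiQ algorithm; the function name, bit-width, and iteration count are assumptions.

```python
import numpy as np

def nonuniform_quantize(weight: np.ndarray, n_bits: int = 3, n_iter: int = 20):
    """Codebook quantization: 2**n_bits freely placed levels (non-uniform grid)."""
    flat = weight.reshape(-1)
    # Initialize levels at evenly spaced quantiles of the weight distribution.
    levels = np.quantile(flat, np.linspace(0.0, 1.0, 2 ** n_bits))
    for _ in range(n_iter):
        # Assign each weight to its nearest level, then move each level to its cluster mean.
        idx = np.abs(flat[:, None] - levels[None, :]).argmin(axis=1)
        for k in range(levels.size):
            if np.any(idx == k):
                levels[k] = flat[idx == k].mean()
    idx = np.abs(flat[:, None] - levels[None, :]).argmin(axis=1)
    return idx.reshape(weight.shape), levels          # per-weight code + learned codebook

# Usage: levels[idx] reconstructs the quantized approximation of the weights.
w = np.random.randn(8, 16).astype(np.float32)
idx, levels = nonuniform_quantize(w, n_bits=3)
w_hat = levels[idx]
```

Unlike the uniform baseline above, the levels here are not constrained to an evenly spaced grid, which is the basic idea behind non-uniform quantization schemes.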