no code implementations • 14 Feb 2024 • Junhan Kim, Kyungphil Park, Chungman Lee, Ho-young Kim, Joonyoung Kim, Yongkweon Jeon
Through extensive experiments on various language models and complexity analysis, we demonstrate that aespa is accurate and efficient in quantizing Transformer models.
1 code implementation • CVPR 2023 • Yongkweon Jeon, Chungman Lee, Ho-young Kim
We also propose a post-training quantization algorithm to enhance the performance of quantized models.
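Post-training quantization in general converts a trained model's weights to low-bit integers without retraining. As a minimal, generic illustration (round-to-nearest uniform quantization, not the specific algorithm proposed in either paper), the core step can be sketched as:

```python
import numpy as np

def quantize_rtn(w, num_bits=8):
    """Generic round-to-nearest uniform post-training quantization.
    Returns integer codes and the scale needed for dequantization."""
    qmax = 2 ** (num_bits - 1) - 1          # e.g. 127 for signed 8-bit
    scale = np.abs(w).max() / qmax          # per-tensor symmetric scale
    q = np.clip(np.round(w / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

# Quantize a small weight matrix and check the reconstruction error.
w = np.array([[0.12, -0.53], [0.98, -0.07]], dtype=np.float32)
q, scale = quantize_rtn(w)
w_hat = q.astype(np.float32) * scale        # dequantized approximation
err = np.abs(w - w_hat).max()               # bounded by scale / 2
```

Post-training methods like those above refine the rounding or the scales using a small calibration set; the round-to-nearest baseline shown here is simply the starting point such algorithms improve upon.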