2 code implementations • 14 Dec 2023 • Xiaoxia Wu, Haojun Xia, Stephen Youn, Zhen Zheng, Shiyang Chen, Arash Bakhtiari, Michael Wyatt, Reza Yazdani Aminabadi, Yuxiong He, Olatunji Ruwase, Leon Song, Zhewei Yao
With our design, FP6 can become a promising solution to the current 4-bit quantization methods used in LLMs.