Search Results for author: Xingyu Zheng

Found 4 papers, 4 papers with code

How Good Are Low-bit Quantized LLaMA3 Models? An Empirical Study

1 code implementation • 22 Apr 2024 • Wei Huang, Xudong Ma, Haotong Qin, Xingyu Zheng, Chengtao Lv, Hong Chen, Jie Luo, Xiaojuan Qi, Xianglong Liu, Michele Magno

This exploration holds the potential to unveil new insights and challenges for low-bit quantization of LLaMA3 and other forthcoming LLMs, especially in addressing the performance degradation that arises in LLM compression.

Language Modelling · Large Language Model · +1
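The study above evaluates low-bit weight quantization of LLaMA3. As a generic point of reference only, and not the paper's evaluation protocol, here is a minimal round-to-nearest fake-quantization sketch in PyTorch; the bit-width, per-channel scaling, and tensor shapes are illustrative assumptions.

```python
# Minimal sketch of symmetric, per-channel round-to-nearest weight quantization.
# Generic illustration only; not the quantization pipeline used in the LLaMA3 study.
import torch

def quantize_weight(w: torch.Tensor, n_bits: int = 4) -> torch.Tensor:
    """Quantize and immediately dequantize a weight matrix ("fake quantization")."""
    qmax = 2 ** (n_bits - 1) - 1                      # e.g. 7 for signed 4-bit
    scale = w.abs().amax(dim=1, keepdim=True) / qmax  # one scale per output channel
    q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax)
    return q * scale

w = torch.randn(256, 512)          # toy weight matrix (shapes are assumptions)
w_q = quantize_weight(w, n_bits=4)
print(f"mean abs quantization error: {(w - w_q).abs().mean():.4f}")
```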

BinaryDM: Towards Accurate Binarization of Diffusion Model

1 code implementation • 8 Apr 2024 • Xingyu Zheng, Haotong Qin, Xudong Ma, Mingyuan Zhang, Haojie Hao, Jiakai Wang, Zixiang Zhao, Jinyang Guo, Xianglong Liu

With the advancement of diffusion models (DMs) and their substantially increased computational requirements, quantization emerges as a practical solution for obtaining compact and efficient low-bit DMs.

Binarization · Quantization
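BinaryDM pushes diffusion-model weights to 1 bit. The sketch below shows only the classic sign-and-scale binarization baseline (a per-channel scale times sign(W)), not BinaryDM's actual method; shapes are illustrative assumptions.

```python
# Minimal sketch of 1-bit weight binarization with a per-channel scaling factor.
# This is the standard alpha * sign(W) baseline, not BinaryDM's specific technique.
import torch

def binarize_weight(w: torch.Tensor) -> torch.Tensor:
    """Binarize weights to {-alpha, +alpha} per output channel."""
    alpha = w.abs().mean(dim=1, keepdim=True)  # per-channel scale minimizing L1 error
    return alpha * torch.sign(w)

w = torch.randn(128, 256)     # toy weight matrix (shapes are assumptions)
w_b = binarize_weight(w)
print(torch.unique(w_b[0]))   # each row takes only two values: -alpha and +alpha
```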

Accurate LoRA-Finetuning Quantization of LLMs via Information Retention

1 code implementation • 8 Feb 2024 • Haotong Qin, Xudong Ma, Xingyu Zheng, Xiaoyang Li, Yang Zhang, Shouda Liu, Jie Luo, Xianglong Liu, Michele Magno

This paper proposes IR-QLoRA, a novel approach that pushes quantized LLMs with LoRA toward high accuracy through information retention.

Quantization
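IR-QLoRA combines LoRA finetuning with a quantized base model. As a generic, QLoRA-style illustration of that setup, without the paper's information-retention components, the sketch below attaches trainable low-rank adapters to a frozen fake-quantized linear layer; layer sizes, rank, and initialization are assumptions.

```python
# Minimal sketch: LoRA adapters on a frozen, fake-quantized linear layer.
# Generic QLoRA-style setup; IR-QLoRA's information-retention mechanisms are not shown.
import torch
import torch.nn as nn

class LoRAQuantLinear(nn.Module):
    def __init__(self, in_features: int, out_features: int, rank: int = 16, n_bits: int = 4):
        super().__init__()
        w = torch.randn(out_features, in_features) * 0.02
        qmax = 2 ** (n_bits - 1) - 1
        scale = w.abs().amax(dim=1, keepdim=True) / qmax
        w_q = torch.clamp(torch.round(w / scale), -qmax - 1, qmax) * scale
        self.register_buffer("weight", w_q)                           # frozen quantized base weight
        self.lora_a = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(out_features, rank))   # zero-init: update starts at zero

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.weight.t()
        update = (x @ self.lora_a.t()) @ self.lora_b.t()              # trainable low-rank correction
        return base + update

layer = LoRAQuantLinear(512, 512)
y = layer(torch.randn(2, 512))
print(y.shape)  # torch.Size([2, 512])
```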

Isolation and Induction: Training Robust Deep Neural Networks against Model Stealing Attacks

1 code implementation • 2 Aug 2023 • Jun Guo, Aishan Liu, Xingyu Zheng, Siyuan Liang, Yisong Xiao, Yichao Wu, Xianglong Liu

However, these defenses now suffer from high inference computational overhead and an unfavorable trade-off between benign accuracy and stealing robustness, which challenges their feasibility for deployed models in practice.
