Search Results for author: Zhengkai Lin

Found 1 paper, 0 papers with code

Model Compression and Efficient Inference for Large Language Models: A Survey

no code implementations · 15 Feb 2024 · Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He

However, large language models have two prominent characteristics compared to smaller models: (1) most compression algorithms require fine-tuning or even retraining the model after compression.
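The point above — that compressed models typically need fine-tuning to recover accuracy — can be illustrated with a toy sketch (not taken from the survey; all names and data here are invented for illustration). We magnitude-prune the weights of a tiny linear model, which degrades its fit, then run a few gradient steps on only the surviving weights to recover part of the loss:

```python
# Toy sketch (hypothetical, not the survey's method): magnitude pruning
# of a linear model, followed by fine-tuning of the unpruned weights.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def prune(w, keep_ratio):
    # Magnitude pruning: keep only the largest-|w| entries, zero the rest.
    k = max(1, int(len(w) * keep_ratio))
    kept = sorted(range(len(w)), key=lambda i: -abs(w[i]))[:k]
    return [wi if i in kept else 0.0 for i, wi in enumerate(w)], kept

def finetune(w, kept, data, lr=0.01, steps=200):
    # Plain gradient descent on MSE, updating only the surviving weights.
    w = list(w)
    for _ in range(steps):
        for i in kept:
            grad = sum(2 * (predict(w, x) - y) * x[i] for x, y in data) / len(data)
            w[i] -= lr * grad
    return w

# Invented "pretrained" weights and toy data.
w0 = [2.0, 0.3, -1.8]
data = [([1.0, 0.5, -1.0], 3.5), ([0.0, 1.0, 2.0], -3.0),
        ([2.0, -1.0, 0.5], 2.5), ([1.0, 1.0, 1.0], 0.5)]

w_pruned, kept = prune(w0, keep_ratio=1 / 3)   # keeps only the largest weight
loss_pruned = mse(w_pruned, data)              # pruning hurts the fit
w_ft = finetune(w_pruned, kept, data)
loss_ft = mse(w_ft, data)                      # fine-tuning recovers some loss
```

For full-size LLMs the same recovery step is what makes many compression pipelines expensive, since it means another round of training over the compressed model.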

Knowledge Distillation · Model Compression · +1
