Search Results for author: Zhengkai Lin

Found 1 paper, 0 papers with code

Model Compression and Efficient Inference for Large Language Models: A Survey

no code implementations · 15 Feb 2024 · Wenxiao Wang, Wei Chen, Yicong Luo, Yongliu Long, Zhengkai Lin, Liye Zhang, Binbin Lin, Deng Cai, Xiaofei He

However, large language models have two prominent characteristics compared to smaller models: (1) most compression algorithms require fine-tuning or even retraining the model after compression.
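The point above — that compressed models typically need fine-tuning to recover accuracy — can be illustrated with a toy sketch (not taken from the survey; all names and data here are invented for illustration). We magnitude-prune the weights of a tiny linear model, which degrades its fit, then run a few gradient steps on only the surviving weights to recover part of the loss:

```python
# Toy sketch (hypothetical, not the survey's method): magnitude pruning
# of a linear model, followed by fine-tuning of the unpruned weights.

def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def mse(w, data):
    return sum((predict(w, x) - y) ** 2 for x, y in data) / len(data)

def prune(w, keep_ratio):
    # Magnitude pruning: keep only the largest-|w| entries, zero the rest.
    k = max(1, int(len(w) * keep_ratio))
    kept = sorted(range(len(w)), key=lambda i: -abs(w[i]))[:k]
    return [wi if i in kept else 0.0 for i, wi in enumerate(w)], kept

def finetune(w, kept, data, lr=0.01, steps=200):
    # Plain gradient descent on MSE, updating only the surviving weights.
    w = list(w)
    for _ in range(steps):
        for i in kept:
            grad = sum(2 * (predict(w, x) - y) * x[i] for x, y in data) / len(data)
            w[i] -= lr * grad
    return w

# Invented "pretrained" weights and toy data.
w0 = [2.0, 0.3, -1.8]
data = [([1.0, 0.5, -1.0], 3.5), ([0.0, 1.0, 2.0], -3.0),
        ([2.0, -1.0, 0.5], 2.5), ([1.0, 1.0, 1.0], 0.5)]

w_pruned, kept = prune(w0, keep_ratio=1 / 3)   # keeps only the largest weight
loss_pruned = mse(w_pruned, data)              # pruning hurts the fit
w_ft = finetune(w_pruned, kept, data)
loss_ft = mse(w_ft, data)                      # fine-tuning recovers some loss
```

For full-size LLMs the same recovery step is what makes many compression pipelines expensive, since it means another round of training over the compressed model.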

Knowledge Distillation · Model Compression · +1
