Search Results for author: Yunfei Cheng

Found 1 paper, 0 papers with code

Recurrent Drafter for Fast Speculative Decoding in Large Language Models

No code implementations • 14 Mar 2024 • Aonan Zhang, Chong Wang, Yi Wang, Xuanyu Zhang, Yunfei Cheng

In this paper, we introduce an improved approach to speculative decoding aimed at enhancing the efficiency of serving large language models.
