1 code implementation • 29 Feb 2024 • Xupeng Miao, Gabriele Oliaro, Xinhao Cheng, Mengdi Wu, Colin Unger, Zhihao Jia
Existing systems cannot handle workloads that mix inference and parameter-efficient finetuning (PEFT) requests.
Language Modelling • Large Language Model