25 Jan 2024 • Yao Fu, Leyang Xue, Yeqi Huang, Andrei-Octavian Brabete, Dmitrii Ustiugov, Yuvraj Patel, Luo Mai
This paper presents ServerlessLLM, a locality-enhanced serverless inference system for Large Language Models (LLMs).