1 code implementation • 3 Nov 2023 • Yijia Zhang, Sicheng Zhang, Shijie Cao, Dayou Du, Jianyu Wei, Ting Cao, Ningyi Xu
Large language models (LLMs) show great performance in various tasks, but face deployment challenges from limited memory capacity and bandwidth.
1 code implementation • 23 Aug 2023 • Ranggi Hwang, Jianyu Wei, Shijie Cao, Changho Hwang, Xiaohu Tang, Ting Cao, Mao Yang
To tackle the high compute requirements of LLMs, the Mixture-of-Experts (MoE) architecture was introduced which is able to scale its model size without proportionally scaling up its computational requirements.