1 code implementation • 12 Mar 2024 • Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng
In this paper, we introduce a novel dynamic expert selection framework for Mixture of Experts (MoE) models, aiming to enhance computational efficiency and model performance by adjusting the number of activated experts based on input difficulty.
1 code implementation • 13 Nov 2023 • Hejing Cao, Zhenwei An, Jiazhan Feng, Kun Xu, Liwei Chen, Dongyan Zhao
While large language models exhibit remarkable performance in the Question Answering task, they are susceptible to hallucinations.
1 code implementation • 24 May 2023 • Quzhe Huang, Mingxu Tao, Chen Zhang, Zhenwei An, Cong Jiang, Zhibin Chen, Zirui Wu, Yansong Feng
Specifically, we inject domain knowledge during the continual training stage and teach the model professional skills using properly designed supervised fine-tuning tasks.
1 code implementation • 31 Oct 2022 • Zhenwei An, Quzhe Huang, Cong Jiang, Yansong Feng, Dongyan Zhao
The charge prediction task aims to predict the charge for a case given its fact description.