1 code implementation • 20 Oct 2023 • Mingwei Zhu, Leigang Sha, Yu Shu, Kangjia Zhao, Tiancheng Zhao, Jianwei Yin
Multimodal large language models (MLLMs) have shown great potential in perception and interpretation tasks, but their capabilities in predictive reasoning remain under-explored.
1 code implementation • 29 Sep 2023 • Guangyao Chen, Siwei Dong, Yu Shu, Ge Zhang, Jaward Sesay, Börje F. Karlsson, Jie Fu, Yemin Shi
Therefore, we introduce AutoAgents, an innovative framework that adaptively generates and coordinates multiple specialized agents to build an AI team according to different tasks.
1 code implementation • 30 Aug 2023 • Yu Shu, Siwei Dong, Guangyao Chen, Wenhao Huang, Ruihua Zhang, Daochen Shi, Qiqi Xiang, Yemin Shi
In this work, we propose the Large Language and Speech Model (LLaSM).
2 code implementations • 17 Apr 2023 • Ge Zhang, Yemin Shi, Ruibo Liu, Ruibin Yuan, Yizhi Li, Siwei Dong, Yu Shu, Zhaoqun Li, Zekun Wang, Chenghua Lin, Wenhao Huang, Jie Fu
Instruction tuning is widely recognized as a key technique for building generalist language models, and it has attracted the attention of researchers and the public with the release of InstructGPT (Ouyang et al., 2022) and ChatGPT (https://chat.openai.com/).
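To make the idea concrete, a minimal sketch of what an instruction-tuning training record can look like is shown below. The instruction/input/output field names and the prompt template are illustrative conventions commonly used in instruction datasets, not necessarily this paper's exact schema.

```python
# Illustrative instruction-tuning record (hypothetical schema, not the paper's).
record = {
    "instruction": "Translate the sentence to French.",
    "input": "The weather is nice today.",
    "output": "Il fait beau aujourd'hui.",
}

def to_prompt(rec):
    """Flatten a record into a supervised fine-tuning (prompt, target) pair."""
    prompt = (
        f"Instruction: {rec['instruction']}\n"
        f"Input: {rec['input']}\n"
        f"Response:"
    )
    return prompt, rec["output"]

prompt, target = to_prompt(record)
```

A model is then fine-tuned to produce `target` given `prompt`, which is what lets a single model follow many different task instructions.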
no code implementations • 6 May 2019 • Yu Shu, Yemin Shi, Yao-Wei Wang, Tiejun Huang, Yonghong Tian
Predictors for new categories are added to the classification layer to "open" the deep neural networks to incorporate new categories dynamically.
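The mechanism described above, appending predictors for new categories to an existing classification layer, can be sketched as follows. This is a minimal NumPy illustration under the assumption that the head is a plain weight matrix of shape (classes, features); the function name and initialization scale are hypothetical, not taken from the paper.

```python
import numpy as np

def open_classifier(W, num_new, rng=None):
    """Append `num_new` randomly initialized predictor rows to a
    classification head W (shape: classes x features), leaving the
    learned predictors for known categories untouched."""
    rng = np.random.default_rng(rng)
    feat_dim = W.shape[1]
    new_rows = rng.normal(scale=0.01, size=(num_new, feat_dim))
    return np.vstack([W, new_rows])

W = np.zeros((5, 16))           # head trained for 5 known categories
W_open = open_classifier(W, 2)  # "open" the network to 2 new categories
print(W_open.shape)             # (7, 16)
```

The appended rows would then be trained on samples of the new categories, so the network incorporates them without retraining the original predictors from scratch.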
no code implementations • 23 Jan 2019 • Yu Shu, Yemin Shi, Yao-Wei Wang, Yixiong Zou, Qingsheng Yuan, Yonghong Tian
Most existing action recognition works hold the closed-set assumption that all action categories are known beforehand, so deep networks can be well trained for these categories.