1 code implementation • 2 Apr 2024 • Mengfei Du, Binhao Wu, Jiwen Zhang, Zhihao Fan, Zejun Li, Ruipu Luo, Xuanjing Huang, Zhongyu Wei
For task completion, the agent needs to align and integrate various navigation modalities, including instruction, observation and navigation history.
no code implementations • 16 Jul 2023 • Ruipu Luo, Jiwen Zhang, Zhongyu Wei
Vision-language decision making (VLDM) is a challenging multimodal task.
1 code implementation • 12 Jun 2023 • Ruipu Luo, Ziwang Zhao, Min Yang, Junwei Dong, Da Li, Pengcheng Lu, Tao Wang, Linmei Hu, Minghui Qiu, Zhongyu Wei
Large language models (LLMs), with their remarkable conversational capabilities, have demonstrated impressive performance across various applications and have emerged as formidable AI assistants.