Search Results for author: Mingze Zhou

Found 2 papers, 2 papers with code

WorldGPT: Empowering LLM as Multimodal World Model

1 code implementation 28 Apr 2024 Zhiqi Ge, Hongzhe Huang, Mingze Zhou, Juncheng Li, Guoming Wang, Siliang Tang, Yueting Zhuang

As for evaluation, we build WorldNet, a multimodal state transition prediction benchmark encompassing varied real-life scenarios.

Language Modelling, Large Language Model

Cross-modal Prompts: Adapting Large Pre-trained Models for Audio-Visual Downstream Tasks

1 code implementation NeurIPS 2023 Haoyi Duan, Yan Xia, Mingze Zhou, Li Tang, Jieming Zhu, Zhou Zhao

This mechanism leverages audio and visual modalities as soft prompts to dynamically adjust the parameters of pre-trained models based on the current multi-modal input features.
