Search Results for author: Yana Wei

Found 2 papers, 0 papers with code

Merlin: Empowering Multimodal LLMs with Foresight Minds

no code implementations • 30 Nov 2023 • En Yu, Liang Zhao, Yana Wei, Jinrong Yang, Dongming Wu, Lingyu Kong, Haoran Wei, Tiancai Wang, Zheng Ge, Xiangyu Zhang, Wenbing Tao

Foresight Instruction-Tuning (FIT) requires MLLMs to first predict the trajectories of related objects and then reason about potential future events based on them.

Ranked #65 on Visual Question Answering on MM-Vet

Visual Question Answering

Grounded Image Text Matching with Mismatched Relation Reasoning

no code implementations • ICCV 2023 • Yu Wu, Yana Wei, Haozhe Wang, Yongfei Liu, Sibei Yang, Xuming He

This paper introduces Grounded Image Text Matching with Mismatched Relation (GITM-MR), a novel visual-linguistic joint task that evaluates the relation understanding capabilities of transformer-based pre-trained models.

Image-text matching, Relation, +2
