no code implementations • 22 Apr 2024 • Dongze Hao, Qunbo Wang, Longteng Guo, Jie Jiang, Jing Liu
Knowledge-based Visual Question Answering (VQA) requires models to incorporate external knowledge to respond to questions about visual content.
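As a rough illustration of what "incorporating external knowledge" can mean in practice, here is a minimal PyTorch sketch of a knowledge-augmented answer head. It is not the paper's architecture; `KnowledgeVQA` and its feature inputs are hypothetical stand-ins for a real vision backbone, text encoder, and knowledge retriever.

```python
# Minimal sketch of a knowledge-based VQA pipeline (not the paper's method).
# Image, question, and retrieved-knowledge features are assumed to come from
# hypothetical upstream encoders and a retriever.
import torch
import torch.nn as nn

class KnowledgeVQA(nn.Module):
    def __init__(self, dim=768, vocab_size=3129):  # 3129 = common VQA answer vocab
        super().__init__()
        self.fuse = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        self.answer_head = nn.Linear(dim, vocab_size)  # classify over answers

    def forward(self, image_feats, question_feats, knowledge_feats):
        # Concatenate question tokens with retrieved-knowledge tokens so the
        # image tokens can attend to external facts alongside the question.
        context = torch.cat([question_feats, knowledge_feats], dim=1)
        fused, _ = self.fuse(query=image_feats, key=context, value=context)
        return self.answer_head(fused.mean(dim=1))  # pooled logits over answers

# Toy usage with random features (batch=2, 768-d tokens).
model = KnowledgeVQA()
logits = model(torch.randn(2, 49, 768),   # image patch features
               torch.randn(2, 16, 768),   # question token features
               torch.randn(2, 64, 768))   # retrieved knowledge token features
print(logits.shape)  # torch.Size([2, 3129])
```

Cross-attention is used here only because it gives the visual tokens a direct path to the retrieved facts; a concrete system could fuse the modalities however its backbone dictates.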
no code implementations • 15 Mar 2024 • Dongze Hao, Jian Jia, Longteng Guo, Qunbo Wang, Te Yang, Yan Li, Yanhua Cheng, Bo Wang, Quan Chen, Han Li, Jing Liu
We condense the retrieved knowledge passages from two perspectives.
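The snippet does not name the two perspectives, so the following is only a generic sketch of passage condensation under an assumed two-stage recipe (passage-level pruning, then sentence-level pruning); `embed` is a hypothetical embedding function supplied by the caller, and `toy_embed` is a deliberately crude stand-in encoder.

```python
# Generic sketch of condensing retrieved passages before feeding them to a
# reader. Both assumed stages score text by cosine similarity to the question.
import numpy as np

def cosine(a, b):
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

def condense(question_vec, passages, embed, top_passages=3, top_sentences=2):
    # Assumed perspective 1: keep only the most question-relevant passages.
    ranked = sorted(passages, key=lambda p: cosine(question_vec, embed(p)),
                    reverse=True)
    condensed = []
    for p in ranked[:top_passages]:
        # Assumed perspective 2: within each kept passage, keep top sentences.
        sents = [s.strip() for s in p.split(".") if s.strip()]
        sents.sort(key=lambda s: cosine(question_vec, embed(s)), reverse=True)
        condensed.append(". ".join(sents[:top_sentences]))
    return condensed

# Toy usage with a bag-of-characters embedding as a stand-in encoder.
def toy_embed(text):
    v = np.zeros(26)
    for ch in text.lower():
        if ch.isalpha():
            v[ord(ch) - 97] += 1
    return v

q = toy_embed("what animal is shown")
print(condense(q, ["Cats are small felines. They purr.",
                   "Steel is an alloy. It contains iron."], toy_embed, 1, 1))
```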
1 code implementation • NeurIPS 2023 • Sihan Chen, Handong Li, Qunbo Wang, Zijia Zhao, Mingzhen Sun, Xinxin Zhu, Jing Liu
Based on the proposed VAST-27M dataset, we train an omni-modality video-text foundation model named VAST, which perceives and processes vision, audio, and subtitle modalities from videos and better supports a variety of tasks, including vision-text, audio-text, and multi-modal video-text tasks (retrieval, captioning, and QA). A toy sketch of omni-modality retrieval follows the leaderboard entry below.
Ranked #1 on Image Captioning on COCO Captions (SPICE metric, using extra training data)
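The sketch below illustrates omni-modality video-text retrieval in the spirit of the description above; it is not VAST's actual model or API, and the simple averaging fusion is a hypothetical stand-in for whatever learned fusion module a real system would use.

```python
# Hedged toy sketch of omni-modality video-text retrieval (not VAST's code).
# Each modality embedding is assumed to come from its own pretrained encoder.
import torch
import torch.nn.functional as F

def fuse_omni(vision_emb, audio_emb, subtitle_emb):
    # Combine the three modality embeddings into one video representation;
    # plain averaging stands in for a learned fusion module.
    return F.normalize((vision_emb + audio_emb + subtitle_emb) / 3.0, dim=-1)

def retrieve(video_embs, text_emb):
    # Rank candidate videos by cosine similarity with the query caption.
    sims = video_embs @ F.normalize(text_emb, dim=-1)
    return torch.argsort(sims, descending=True)

# Toy usage: 5 candidate videos, 512-d embeddings per modality.
v, a, s = (F.normalize(torch.randn(5, 512), dim=-1) for _ in range(3))
videos = fuse_omni(v, a, s)
query = torch.randn(512)  # caption embedding from a hypothetical text encoder
print(retrieve(videos, query))  # video indices, best match first
```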