1 code implementation • 11 Mar 2024 • Qilang Ye, Zitong Yu, Xin Liu
Audio-visual question answering (AVQA) requires drawing on both video content and auditory information, then correlating them with the question to predict the most precise answer.
Audio-Visual Question Answering (AVQA)
1 code implementation • 7 Mar 2024 • Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
This paper focuses on the challenge of answering questions in scenarios that are composed of rich and complex dynamic audio-visual components.
Audio-Visual Question Answering (AVQA)