Audio-visual Question Answering

11 papers with code • 1 benchmark • 1 dataset

Most implemented papers

Answering Diverse Questions via Text Attached with Key Audio-Visual Clues

rikeilong/mcd-foravqa 11 Mar 2024

Audio-visual question answering (AVQA) requires reference to video content and auditory information, followed by correlating the question to predict the most precise answer.