Explanatory Visual Question Answering
3 papers with code • 1 benchmark • 1 dataset
Explanatory Visual Question Answering (EVQA) requires answering visual questions and generating multimodal explanations of the underlying reasoning process.
Most implemented papers
Faithful Multimodal Explanation for Visual Question Answering
AI systems' ability to explain their reasoning is critical to their utility and trustworthiness.
REX: Reasoning-aware and Grounded Explanation
With our new data and method, we perform extensive analyses to study the effectiveness of our explanation under different settings, including multi-task learning and transfer learning.
Variational Causal Inference Network for Explanatory Visual Question Answering
We propose a Variational Causal Inference Network (VCIN) that establishes the causal correlation between predicted answers and explanations and captures cross-modal relationships to generate rational explanations.