no code implementations • 20 Dec 2023 • Yunye Gong, Robik Shrestha, Jared Claypoole, Michael Cogswell, Arijit Ray, Christopher Kanan, Ajay Divakaran
We propose a novel VQA dataset, BloomVQA, to facilitate comprehensive evaluation of large vision-language models on comprehension tasks.
1 code implementation • 30 Nov 2023 • Dina Bashkirova, Arijit Ray, Rupayan Mallick, Sarah Adel Bargal, Jianming Zhang, Ranjay Krishna, Kate Saenko
Although generative editing methods now enable some forms of image editing, relighting is still beyond today's capabilities; existing methods struggle to keep other aspects of the image -- colors, shapes, and textures -- consistent after the edit.
no code implementations • 31 Aug 2023 • Katherine Deng, Arijit Ray, Reuben Tan, Saadia Gabriel, Bryan A. Plummer, Kate Saenko
We further see that current captioning metrics based on large vision-language models also fail to correlate with human preferences.
no code implementations • CVPR 2023 • Reuben Tan, Arijit Ray, Andrea Burns, Bryan A. Plummer, Justin Salamon, Oriol Nieto, Bryan Russell, Kate Saenko
We propose a self-supervised approach for learning to perform audio source separation in videos based on natural language queries, using only unlabeled video and audio pairs as training data.
no code implementations • 13 Oct 2021 • Kamran Alipour, Arijit Ray, Xiao Lin, Michael Cogswell, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas
In the domain of Visual Question Answering (VQA), studies have shown improvement in users' mental model of the VQA system when they are exposed to examples of how these systems answer certain Image-Question (IQ) pairs.
no code implementations • 26 Mar 2021 • Arijit Ray, Michael Cogswell, Xiao Lin, Kamran Alipour, Ajay Divakaran, Yi Yao, Giedrius Burachas
Hence, we propose Error Maps that clarify the error by highlighting image regions where the model is prone to err.
no code implementations • 2 Jul 2020 • Kamran Alipour, Arijit Ray, Xiao Lin, Jurgen P. Schulze, Yi Yao, Giedrius T. Burachas
In this paper, we evaluate the impact of explanations on the user's mental model of AI agent competency within the task of visual question answering (VQA).
no code implementations • IJCNLP 2019 • Arijit Ray, Karan Sikka, Ajay Divakaran, Stefan Lee, Giedrius Burachas
For instance, if a model answers "red" to "What color is the balloon?"
no code implementations • 5 Apr 2019 • Arijit Ray, Yi Yao, Rakesh Kumar, Ajay Divakaran, Giedrius Burachas
Our experiments, therefore, demonstrate that ExAG is an effective means to evaluate the efficacy of AI-generated explanations on a human-AI collaborative task.
no code implementations • 15 Feb 2019 • Shalini Ghosh, Giedrius Burachas, Arijit Ray, Avi Ziskind
In this paper, we present a novel approach for the task of eXplainable Question Answering (XQA), i.e., generating natural language (NL) explanations for the Visual Question Answering (VQA) problem.
no code implementations • EMNLP 2016 • Arijit Ray, Gordon Christie, Mohit Bansal, Dhruv Batra, Devi Parikh
We introduce the novel problem of determining the relevance of questions to images in VQA.