Search Results for author: Aisha Urooj Khan

Found 7 papers, 5 papers with code

Learning Situation Hyper-Graphs for Video Question Answering

1 code implementation • CVPR 2023 • Aisha Urooj Khan, Hilde Kuehne, Bo Wu, Kim Chheu, Walid Bousselham, Chuang Gan, Niels Lobo, Mubarak Shah

The proposed method is trained end-to-end and optimized with a cross-entropy VQA loss and a Hungarian matching loss for situation graph prediction.

Ranked #6 on Video Question Answering on AGQA 2.0 balanced (Average Accuracy metric)

Question Answering • Video Question Answering • +1

Weakly Supervised Grounding for VQA in Vision-Language Transformers

1 code implementation • 5 Jul 2022 • Aisha Urooj Khan, Hilde Kuehne, Chuang Gan, Niels da Vitoria Lobo, Mubarak Shah

Transformers for visual-language representation learning have attracted a lot of interest and shown tremendous performance on visual question answering (VQA) and grounding.

Question Answering • Representation Learning • +1

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

1 code implementation • CVPR 2021 • Aisha Urooj Khan, Hilde Kuehne, Kevin Duarte, Chuang Gan, Niels Lobo, Mubarak Shah

In this paper, we focus on a more relaxed setting: the grounding of relevant visual entities in a weakly supervised manner by training on the VQA task alone.

Question Answering • Visual Question Answering

MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Aisha Urooj Khan, Amir Mazaheri, Niels da Vitoria Lobo, Mubarak Shah

We present MMFT-BERT (MultiModal Fusion Transformer with BERT encodings) to solve Visual Question Answering (VQA), ensuring individual and combined processing of multiple input modalities.

Question Answering • Visual Question Answering

Analysis of Hand Segmentation in the Wild

1 code implementation • CVPR 2018 • Aisha Urooj Khan, Ali Borji

In the quest for robust hand segmentation methods, we evaluated the performance of state-of-the-art semantic segmentation methods, both off-the-shelf and fine-tuned, on existing datasets.

Fine-grained Action Recognition • Hand Segmentation • +3