Search Results for author: Aisha Urooj Khan

Found 7 papers, 5 papers with code

Learning Situation Hyper-Graphs for Video Question Answering

1 code implementation CVPR 2023 Aisha Urooj Khan, Hilde Kuehne, Bo Wu, Kim Chheu, Walid Bousselham, Chuang Gan, Niels Lobo, Mubarak Shah

The proposed method is trained in an end-to-end manner and optimized by a VQA loss with the cross-entropy function and a Hungarian matching loss for the situation graph prediction.

Ranked #6 on Video Question Answering on AGQA 2.0 balanced (Average Accuracy metric)

Question Answering Video Question Answering +1

Weakly Supervised Grounding for VQA in Vision-Language Transformers

1 code implementation 5 Jul 2022 Aisha Urooj Khan, Hilde Kuehne, Chuang Gan, Niels da Vitoria Lobo, Mubarak Shah

Transformers for visual-language representation learning have attracted considerable interest and shown tremendous performance on visual question answering (VQA) and grounding.

Question Answering Representation Learning +1

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

1 code implementation CVPR 2021 Aisha Urooj Khan, Hilde Kuehne, Kevin Duarte, Chuang Gan, Niels Lobo, Mubarak Shah

In this paper, we focus on a more relaxed setting: the grounding of relevant visual entities in a weakly supervised manner by training on the VQA task alone.

Question Answering Visual Question Answering

MMFT-BERT: Multimodal Fusion Transformer with BERT Encodings for Visual Question Answering

1 code implementation Findings of the Association for Computational Linguistics 2020 Aisha Urooj Khan, Amir Mazaheri, Niels da Vitoria Lobo, Mubarak Shah

We present MMFT-BERT (MultiModal Fusion Transformer with BERT encodings) to solve Visual Question Answering (VQA), ensuring individual and combined processing of multiple input modalities.

Question Answering Visual Question Answering

Analysis of Hand Segmentation in the Wild

1 code implementation CVPR 2018 Aisha Urooj Khan, Ali Borji

In the quest for robust hand segmentation methods, we evaluated the performance of state-of-the-art semantic segmentation methods, both off-the-shelf and fine-tuned, on existing datasets.

Fine-grained Action Recognition Hand Segmentation +3

Segmenting Sky Pixels in Images

no code implementations 26 Dec 2017 Cecilia La Place, Aisha Urooj Khan, Ali Borji

As a result of our efforts, we have seen an improvement of 10-15% in the average MCR compared to the prior methods on the SkyFinder dataset.

Scene Parsing

Egocentric Height Estimation

no code implementations 9 Oct 2016 Jessica Finocchiaro, Aisha Urooj Khan, Ali Borji

We used both traditional computer vision approaches and deep learning to determine the visual cues that result in the best height estimation.

Object Recognition Object Tracking +1
