Search Results for author: Muhammad Haris Khan

Found 35 papers, 21 papers with code

Face-voice Association in Multilingual Environments (FAME) Challenge 2024 Evaluation Plan

1 code implementation • 14 Apr 2024 • Muhammad Saad Saeed, Shah Nawaz, Muhammad Salman Tahir, Rohan Kumar Das, Muhammad Zaigham Zaheer, Marta Moscati, Markus Schedl, Muhammad Haris Khan, Karthik Nandakumar, Muhammad Haroon Yousaf

The Face-voice Association in Multilingual Environments (FAME) Challenge 2024 focuses on exploring face-voice association under the unique condition of a multilingual scenario.


Pose-Guided Self-Training with Two-Stage Clustering for Unsupervised Landmark Discovery

1 code implementation • 24 Mar 2024 • Siddharth Tourani, Ahmed Alwheibi, Arif Mahmood, Muhammad Haris Khan

Second, motivated by the ZeroShot performance, we develop a ULD algorithm based on diffusion features using self-training and clustering which also outperforms prior methods by notable margins.

Clustering Self-Supervised Learning


Towards Generalizing to Unseen Domains with Few Labels

1 code implementation • 18 Mar 2024 • Chamuditha Jayanga Galappaththige, Sanoojan Baliah, Malitha Gunawardhana, Muhammad Haris Khan

Existing domain generalization (DG) methods which are unable to exploit unlabeled data perform poorly compared to semi-supervised learning (SSL) methods under SSDG setting.

Domain Generalization Pseudo Label +1


Why Not Use Your Textbook? Knowledge-Enhanced Procedure Planning of Instructional Videos

2 code implementations • 5 Mar 2024 • Kumaranage Ravindu Yasas Nagasinghe, Honglu Zhou, Malitha Gunawardhana, Martin Renqiang Min, Daniel Harari, Muhammad Haris Khan

This knowledge, sourced from training procedure plans and structured as a directed weighted graph, equips the agent to better navigate the complexities of step sequencing and its potential variations.

Logical Sequence Navigate


Improving Pseudo-labelling and Enhancing Robustness for Semi-Supervised Domain Generalization

1 code implementation • 25 Jan 2024 • Adnan Khan, Mai A. Shaaban, Muhammad Haris Khan

A key challenge, faced by the best-performing SSL-based SSDG methods, is selecting accurate pseudo-labels under multiple domain shifts and reducing overfitting to source domains under limited labels.

Domain Generalization Semi-Supervised Domain Generalization


Unified Spatio-Temporal Tri-Perspective View Representation for 3D Semantic Occupancy Prediction

no code implementations • 24 Jan 2024 • Sathira Silva, Savindu Bhashitha Wannigama, Gihan Jayatilaka, Muhammad Haris Khan, Roshan Ragel

Holistic understanding and reasoning in 3D scenes play a vital role in the success of autonomous driving systems.

3D Semantic Occupancy Prediction Autonomous Driving +1


Domain Adaptive Object Detection via Balancing Between Self-Training and Adversarial Learning

no code implementations • 8 Nov 2023 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali

Deep learning based object detectors struggle to generalize to a new target domain bearing significant variations in object and background.

Object object-detection +1


Cal-DETR: Calibrated Detection Transformer

1 code implementation • NeurIPS 2023 • Muhammad Akhtar Munir, Salman Khan, Muhammad Haris Khan, Mohsen Ali, Fahad Shahbaz Khan

Third, we develop a logit mixing approach that acts as a regularizer with detection-specific losses and is also complementary to the uncertainty-guided logit modulation technique to further improve the calibration performance.

Decision Making


Generalizing to Unseen Domains in Diabetic Retinopathy Classification

1 code implementation • 26 Oct 2023 • Chamuditha Jayanga Galappaththige, Gayal Kuruppu, Muhammad Haris Khan

Therefore, automated diabetic retinopathy classification using deep learning techniques has gained interest in the medical imaging community.

Classification Domain Generalization


Unsupervised Landmark Discovery Using Consistency Guided Bottleneck

1 code implementation • 19 Sep 2023 • Mamona Awan, Muhammad Haris Khan, Sanoojan Baliah, Muhammad Ahmad Waseem, Salman Khan, Fahad Shahbaz Khan, Arif Mahmood

In the current work, we introduce a consistency-guided bottleneck in an image reconstruction-based pipeline that leverages landmark consistency, a measure of compatibility score with the pseudo-ground truth to generate adaptive heatmaps.

Image Reconstruction


Multiclass Alignment of Confidence and Certainty for Network Calibration

no code implementations • 6 Sep 2023 • Vinith Kugathasan, Muhammad Haris Khan

It is based on the observation that a model miscalibration is directly related to its predictive certainty, so a higher gap between the mean confidence and certainty amounts to a poor calibration both for in-distribution and out-of-distribution predictions.

Image Classification Medical Image Classification


Exploring the Transfer Learning Capabilities of CLIP in Domain Generalization for Diabetic Retinopathy

1 code implementation • 27 Aug 2023 • Sanoojan Baliah, Fadillah A. Maani, Santosh Sanjeev, Muhammad Haris Khan

In this study, we investigate CLIP's transfer learning capabilities and its potential for cross-domain generalization in diabetic retinopathy (DR) classification.

Classification Domain Generalization +1


Unsupervised Deep Graph Matching Based on Cycle Consistency

no code implementations • 18 Jul 2023 • Siddharth Tourani, Carsten Rother, Muhammad Haris Khan, Bogdan Savchynskyy

We contribute to the sparsely populated area of unsupervised deep graph matching with application to keypoint matching in images.

Graph Matching


Multiclass Confidence and Localization Calibration for Object Detection

2 code implementations • CVPR 2023 • Bimsara Pathiraja, Malitha Gunawardhana, Muhammad Haris Khan

Surprisingly, very little to no attempts have been made in studying the calibration of object detection methods, which occupy a pivotal space in vision-based security-sensitive, and safety-critical applications.

Object object-detection +1


Bridging Precision and Confidence: A Train-Time Loss for Calibrating Object Detection

1 code implementation • CVPR 2023 • Muhammad Akhtar Munir, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan

Since the original formulation of our loss depends on the counts of true positives and false positives in a minibatch, we develop a differentiable proxy of our loss that can be used during training with other application-specific loss functions.

object-detection Object Detection


Single-branch Network for Multimodal Training

1 code implementation • 10 Mar 2023 • Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Muhammad Zaigham Zaheer, Karthik Nandakumar, Muhammad Haroon Yousaf, Arif Mahmood

With the rapid growth of social media platforms, users are sharing billions of multimedia posts containing audio, images, and text.

Cross-Modal Retrieval Retrieval


MSI: Maximize Support-Set Information for Few-Shot Segmentation

1 code implementation • ICCV 2023 • Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia

To extract information relevant to the target class, a dominant approach in best-performing FSS methods removes background features using a support mask.

Ranked #3 on Few-Shot Semantic Segmentation on COCO-20i -> Pascal VOC (1-shot)

Few-Shot Semantic Segmentation


Towards Improving Calibration in Object Detection Under Domain Shift

no code implementations • 15 Sep 2022 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali

To this end, we first propose a new, plug-and-play, train-time calibration loss for object detection (coined as TCD).

Decision Making Object +3


Learning Branched Fusion and Orthogonal Projection for Face-Voice Association

1 code implementation • 22 Aug 2022 • Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Sajid Javed, Muhammad Haroon Yousaf, Alessio Del Bue

In addition, we leverage cross-modal verification and matching tasks to analyze the impact of multiple languages on face-voice association.

Metric Learning


Self-Distilled Vision Transformer for Domain Generalization

2 code implementations • 25 Jul 2022 • Maryam Sultana, Muzammal Naseer, Muhammad Haris Khan, Salman Khan, Fahad Shahbaz Khan

Similar to CNNs, ViTs also struggle in out-of-distribution scenarios and the main culprit is overfitting to source domains.

Domain Generalization


Video Instance Segmentation via Multi-scale Spatio-temporal Split Attention Transformer

1 code implementation • 24 Mar 2022 • Omkar Thawakar, Sanath Narayan, Jiale Cao, Hisham Cholakkal, Rao Muhammad Anwer, Muhammad Haris Khan, Salman Khan, Michael Felsberg, Fahad Shahbaz Khan

When using the ResNet50 backbone, our MS-STS achieves a mask AP of 50.1%, outperforming the best reported results in literature by 2.7% and by 4.8% at higher overlap threshold of AP_75, while being comparable in model size and speed on Youtube-VIS 2019 val.

Instance Segmentation Semantic Segmentation +2


HM: Hybrid Masking for Few-Shot Segmentation

1 code implementation • 24 Mar 2022 • Seonghyeon Moon, Samuel S. Sohn, Honglu Zhou, Sejong Yoon, Vladimir Pavlovic, Muhammad Haris Khan, Mubbasir Kapadia

A fundamental limitation of FM is the inability to preserve the fine-grained spatial details that affect the accuracy of segmentation mask, especially for small target objects.

Ranked #5 on Few-Shot Semantic Segmentation on FSS-1000 (1-shot)

Few-Shot Semantic Segmentation Segmentation +1


Generative Cooperative Learning for Unsupervised Video Anomaly Detection

no code implementations • CVPR 2022 • Muhammad Zaigham Zaheer, Arif Mahmood, Muhammad Haris Khan, Mattia Segu, Fisher Yu, Seung-Ik Lee

Video anomaly detection is well investigated in weakly-supervised and one-class classification (OCC) settings.

One-Class Classification Video Anomaly Detection


Transformers in Medical Imaging: A Survey

1 code implementation • 24 Jan 2022 • Fahad Shamshad, Salman Khan, Syed Waqas Zamir, Muhammad Haris Khan, Munawar Hayat, Fahad Shahbaz Khan, Huazhu Fu

Following unprecedented success on the natural language tasks, Transformers have been successfully applied to several computer vision problems, achieving state-of-the-art results and prompting researchers to reconsider the supremacy of convolutional neural networks (CNNs) as de facto operators.

Image Classification Image Segmentation +6


Fusion and Orthogonal Projection for Improved Face-Voice Association

2 code implementations • 20 Dec 2021 • Muhammad Saad Saeed, Muhammad Haris Khan, Shah Nawaz, Muhammad Haroon Yousaf, Alessio Del Bue

Prior works adopt pairwise or triplet loss formulations to learn an embedding space amenable for associated matching and verification tasks.

Cross-Modal Retrieval


Visual Object Tracking with Discriminative Filters and Siamese Networks: A Survey and Outlook

no code implementations • 6 Dec 2021 • Sajid Javed, Martin Danelljan, Fahad Shahbaz Khan, Muhammad Haris Khan, Michael Felsberg, Jiri Matas

Accurate and robust visual object tracking is one of the most challenging and fundamental computer vision problems.

Visual Object Tracking Visual Tracking


SSAL: Synergizing between Self-Training and Adversarial Learning for Domain Adaptive Object Detection

no code implementations • NeurIPS 2021 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Sarfraz, Mohsen Ali

In this paper, we propose to leverage model’s predictive uncertainty to strike the right balance between adversarial feature alignment and class-level alignment.

Object object-detection +1


Synergizing between Self-Training and Adversarial Learning for Domain Adaptive Object Detection

no code implementations • 1 Oct 2021 • Muhammad Akhtar Munir, Muhammad Haris Khan, M. Saquib Sarfraz, Mohsen Ali

In this paper, we propose to leverage model predictive uncertainty to strike the right balance between adversarial feature alignment and class-level alignment.

Object object-detection +1


Rich Semantics Improve Few-shot Learning

no code implementations • 26 Apr 2021 • Mohamed Afham, Salman Khan, Muhammad Haris Khan, Muzammal Naseer, Fahad Shahbaz Khan

Human learning benefits from multi-modal inputs that often appear as rich semantics (e.g., description of an object's attributes while learning about it).

Ranked #1 on Few-Shot Image Classification on Oxford 102 Flower (using extra training data)

Few-Shot Image Classification Few-Shot Learning


Deep Contextual Attention for Human-Object Interaction Detection

no code implementations • ICCV 2019 • Tiancai Wang, Rao Muhammad Anwer, Muhammad Haris Khan, Fahad Shahbaz Khan, Yanwei Pang, Ling Shao, Jorma Laaksonen

Our approach outperforms the state-of-the-art on all datasets.

Human-Object Interaction Detection Object +3


Mask-Guided Attention Network for Occluded Pedestrian Detection

1 code implementation • ICCV 2019 • Yanwei Pang, Jin Xie, Muhammad Haris Khan, Rao Muhammad Anwer, Fahad Shahbaz Khan, Ling Shao

Our approach obtains an absolute gain of 9.5% in log-average miss rate, compared to the best reported results on the heavily occluded (HO) pedestrian set of CityPersons test set.

Pedestrian Detection


AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces

1 code implementation • CVPR 2020 • Muhammad Haris Khan, John McDonagh, Salman Khan, Muhammad Shahabuddin, Aditya Arora, Fahad Shahbaz Khan, Ling Shao, Georgios Tzimiropoulos

Several studies show that animal needs are often expressed through their faces.

Face Alignment Face Detection


Pushing the boundaries of audiovisual word recognition using Residual Networks and LSTMs

no code implementations • 3 Nov 2018 • Themos Stafylakis, Muhammad Haris Khan, Georgios Tzimiropoulos

A further analysis on the utility of target word boundaries is provided, as well as on the capacity of the network in modeling the linguistic context of the target word.

Lipreading speech-recognition +1


Synergy Between Face Alignment and Tracking via Discriminative Global Consensus Optimization

no code implementations • ICCV 2017 • Muhammad Haris Khan, John McDonagh, Georgios Tzimiropoulos

Tracking-by-detection is drift-free but results in low accuracy fittings.

Face Alignment Open-Ended Question Answering


TRIC-track: Tracking by Regression With Incrementally Learned Cascades

no code implementations • ICCV 2015 • Xiaomeng Wang, Michel Valstar, Brais Martinez, Muhammad Haris Khan, Tony Pridmore

This paper proposes a novel approach to part-based tracking by replacing local matching of an appearance model by direct prediction of the displacement between local image patches and part locations.

Incremental Learning regression

