Search Results for author: Niluthpol Chowdhury Mithun

Found 14 papers, 5 papers with code

Unsupervised Domain Adaptation for Semantic Segmentation with Pseudo Label Self-Refinement

no code implementations • 25 Oct 2023 • Xingchen Zhao, Niluthpol Chowdhury Mithun, Abhinav Rajvanshi, Han-Pang Chiu, Supun Samarasekera

Recent state-of-the-art (SOTA) UDA methods employ a teacher-student self-training approach, where a teacher model is used to generate pseudo-labels for the new data which in turn guide the training process of the student model.

Pseudo Label Semantic Segmentation +1

C-SFDA: A Curriculum Learning Aided Self-Training Framework for Efficient Source Free Domain Adaptation

no code implementations • CVPR 2023 • Nazmul Karim, Niluthpol Chowdhury Mithun, Abhinav Rajvanshi, Han-Pang Chiu, Supun Samarasekera, Nazanin Rahnavard

In this regard, source-free domain adaptation (SFDA) excels as access to source data is no longer required during adaptation.

Ranked #3 on Source-Free Domain Adaptation on VisDA-2017

Memorization Pseudo Label +3

Cross-View Visual Geo-Localization for Outdoor Augmented Reality

no code implementations • 28 Mar 2023 • Niluthpol Chowdhury Mithun, Kshitij Minhas, Han-Pang Chiu, Taragay Oskiper, Mikhail Sizintsev, Supun Samarasekera, Rakesh Kumar

Precise estimation of global orientation and location is critical to ensure a compelling outdoor Augmented Reality (AR) experience.

Pose Estimation

GraphMapper: Efficient Visual Navigation by Scene Graph Generation

no code implementations • 17 May 2022 • Zachary Seymour, Niluthpol Chowdhury Mithun, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

Understanding the geometric relationships between objects in a scene is a core capability in enabling both humans and autonomous agents to navigate in new environments.

Graph Generation Navigate +2

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation in Continuous Environments

1 code implementation • 26 Aug 2021 • Muhammad Zubair Irshad, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

This paper presents a novel approach for the Vision-and-Language Navigation (VLN) task in continuous 3D environments, which requires an autonomous agent to follow natural language instructions in unseen environments.

Vision and Language Navigation

Recall Loss for Imbalanced Image Classification and Semantic Segmentation

1 code implementation • 1 Jan 2021 • Junjiao Tian, Niluthpol Chowdhury Mithun, Zachary Seymour, Han-Pang Chiu, Zsolt Kira

Many works have proposed to weigh the standard cross entropy loss function with pre-computed weights based on class statistics such as the number of samples and class margins.

Classification General Classification +4

RGB2LIDAR: Towards Solving Large-Scale Cross-Modal Visual Localization

1 code implementation • 12 Sep 2020 • Niluthpol Chowdhury Mithun, Karan Sikka, Han-Pang Chiu, Supun Samarasekera, Rakesh Kumar

To enable large-scale evaluation, we introduce a new dataset containing over 550K pairs (covering a 143 km² area) of RGB and aerial LIDAR depth images.

Visual Localization

Text-based Localization of Moments in a Video Corpus

no code implementations • 20 Aug 2020 • Sudipta Paul, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury

This task poses a unique challenge as the system is required to perform: (i) retrieval of the relevant video where only a segment of the video corresponds with the queried sentence, and (ii) temporal localization of moment in the relevant video based on sentence query.

Moment Retrieval Retrieval +2

A Skip Connection Architecture for Localization of Image Manipulations

no code implementations • IEEE Conference on Computer Vision and Pattern Recognition (CVPR) Workshops 2019 • Ghazal Mazaheri, Niluthpol Chowdhury Mithun, Jawadul H. Bappy, Amit K. Roy-Chowdhury

To exploit these traces in localizing the tampered regions, we propose an encoder-decoder based network where we fuse representations from early layers in the encoder (which are richer in low-level spatial cues, like edges) by skip pooling with representations of the last layer of the decoder, and use the fused representation for manipulation detection.

Decoder Image Manipulation +1

Weakly Supervised Video Moment Retrieval From Text Queries

1 code implementation • CVPR 2019 • Niluthpol Chowdhury Mithun, Sujoy Paul, Amit K. Roy-Chowdhury

The weak nature of the supervision is because, during training, we only have access to the video-text pairs rather than the temporal extent of the video to which different text descriptions relate.

Moment Retrieval Natural Language Queries +2

Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval

no code implementations • Proceedings of the 26th ACM International Conference on Multimedia, October 2018 • Niluthpol Chowdhury Mithun, Rameswar Panda, Vagelis Papalexakis, Amit K. Roy-Chowdhury

Inspired by the recent success of web-supervised learning in deep neural networks, we capitalize on readily-available web images with noisy annotations to learn robust image-text joint representation.

Cross-Modal Retrieval Retrieval +1

Webly Supervised Joint Embedding for Cross-Modal Image-Text Retrieval

no code implementations • 23 Aug 2018 • Niluthpol Chowdhury Mithun, Rameswar Panda, Evangelos E. Papalexakis, Amit K. Roy-Chowdhury

Inspired by the recent success of webly supervised learning in deep neural networks, we capitalize on readily-available web images with noisy annotations to learn robust image-text joint representation.

Cross-Modal Retrieval Retrieval +1

Learning Joint Embedding with Multimodal Cues for Cross-Modal Video-Text Retrieval

1 code implementation • ICMR 2018 • Niluthpol Chowdhury Mithun, Juncheng Li, Florian Metze, Amit K. Roy-Chowdhury

Constructing a joint representation invariant across different modalities (e.g., video, language) is of significant importance in many multimedia applications.

Ranked #37 on Video Retrieval on MSR-VTT

Retrieval Text Retrieval +1

Diversity-aware Multi-Video Summarization

no code implementations • 9 Jun 2017 • Rameswar Panda, Niluthpol Chowdhury Mithun, Amit K. Roy-Chowdhury

Most video summarization approaches have focused on extracting a summary from a single video; we propose an unsupervised framework for summarizing a collection of videos.

Video Summarization
