no code implementations • 8 Mar 2024 • Lin Zhang, Shentong Mo, Yijing Zhang, Pedro Morgado
We hope our established benchmark can open new avenues for controllable visual generation.
1 code implementation • 2 Dec 2023 • Shentong Mo, Pedro Morgado
Thus, to address the computational complexity, we propose an alternative procedure that factorizes the local representations before modeling audio-visual interactions.
1 code implementation • ICCV 2023 • Cheng-En Wu, Yu Tian, Haichao Yu, Heng Wang, Pedro Morgado, Yu Hen Hu, Linjie Yang
Vision-language models such as CLIP learn a generic text-image embedding from large-scale training data.
1 code implementation • 30 May 2023 • Shentong Mo, Pedro Morgado
The ability to accurately recognize, localize and separate sound sources is fundamental to any audio-visual perception task.
1 code implementation • 27 Sep 2022 • Himangi Mittal, Pedro Morgado, Unnat Jain, Abhinav Gupta
However, learning representations from videos can be challenging.
Ranked #1 on Long Term Action Anticipation on Ego4D (ED@20 Noun metric)
1 code implementation • 30 Aug 2022 • Shentong Mo, Pedro Morgado
We also propose a new approach for visual sound source localization that addresses both these problems.
no code implementations • 23 Mar 2022 • Senthil Purushwalkam, Pedro Morgado, Abhinav Gupta
As a result, SSL holds the promise to learn representations from data in-the-wild, i.e., without the need for finite and static datasets.
1 code implementation • 17 Mar 2022 • Shentong Mo, Pedro Morgado
Unsupervised audio-visual source localization aims at localizing visible sound sources in a video without relying on ground-truth localization for training.
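A common way to localize sound sources without ground-truth boxes is to compare an audio embedding against every spatial position of a visual feature map and read off a similarity heatmap. The sketch below is a minimal illustration of that idea in NumPy; the function name and shapes are hypothetical and not taken from the paper's code.

```python
import numpy as np

def localization_map(visual_feats, audio_feat):
    """Cosine similarity between one audio embedding and each spatial
    visual feature; high values mark likely sound-source locations.
    visual_feats: (H, W, D) feature map, audio_feat: (D,) embedding."""
    v = visual_feats / np.linalg.norm(visual_feats, axis=-1, keepdims=True)
    a = audio_feat / np.linalg.norm(audio_feat)
    return v @ a  # (H, W) heatmap of cosine similarities in [-1, 1]

# Toy check: the cell whose feature matches the audio scores highest.
H, W, D = 4, 4, 8
rng = np.random.default_rng(0)
feats = rng.normal(size=(H, W, D))
audio = feats[2, 3].copy()  # pretend the source sits at cell (2, 3)
heat = localization_map(feats, audio)
print(np.unravel_index(heat.argmax(), heat.shape))  # → (2, 3)
```

In practice the heatmap is computed from learned encoders and thresholded or upsampled to produce a localization mask; here random features stand in for both.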
no code implementations • CVPR 2021 • Pedro Morgado, Ishan Misra, Nuno Vasconcelos
Second, since self-supervised contrastive learning relies on random sampling of negative instances, instances that are semantically similar to the base instance can be used as faulty negatives.
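The "faulty negatives" problem above can be made concrete: when negatives are drawn at random, some may be near-duplicates of the anchor and are then wrongly pushed away. A crude remedy, sketched below, drops sampled negatives whose similarity to the anchor exceeds a threshold; the threshold and function are illustrative assumptions, not the paper's actual mechanism (which weights instances rather than hard-filtering them).

```python
import numpy as np

def filter_faulty_negatives(anchor, negatives, sim_thresh=0.8):
    """Drop sampled negatives whose cosine similarity to the anchor
    exceeds sim_thresh -- likely the same semantic class as the anchor.
    sim_thresh is an illustrative knob, not a value from the paper."""
    a = anchor / np.linalg.norm(anchor)
    n = negatives / np.linalg.norm(negatives, axis=1, keepdims=True)
    sims = n @ a
    return negatives[sims < sim_thresh]

rng = np.random.default_rng(1)
anchor = rng.normal(size=16)
negs = rng.normal(size=(10, 16))
negs[0] = anchor + 0.01 * rng.normal(size=16)  # near-duplicate: a faulty negative
kept = filter_faulty_negatives(anchor, negs)
print(len(kept))  # the near-duplicate is filtered out
```

Random high-dimensional vectors are nearly orthogonal, so genuine negatives survive the filter while the planted near-duplicate does not.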
no code implementations • NeurIPS 2020 • Pedro Morgado, Yi Li, Nuno Vasconcelos
To learn from these spatial cues, we tasked a network to perform contrastive audio-visual spatial alignment of 360° video and spatial audio.
no code implementations • 27 Jul 2020 • Pedro Morgado, Yunsheng Li, Jose Costa Pereira, Mohammad Saberian, Nuno Vasconcelos
The use of a fixed set of proxies (weights of the CNN classification layer) is proposed to eliminate this ambiguity, and a procedure to design proxy sets that are nearly optimal for both classification and hashing is introduced.
1 code implementation • ECCV 2020 • Tz-Ying Wu, Pedro Morgado, Pei Wang, Chih-Hui Ho, Nuno Vasconcelos
Motivated by this, a deep realistic taxonomic classifier (Deep-RTC) is proposed as a new solution to the long-tail problem, combining realism with hierarchical predictions.
1 code implementation • CVPR 2021 • Pedro Morgado, Nuno Vasconcelos, Ishan Misra
Our method uses contrastive learning for cross-modal discrimination of video from audio and vice-versa.
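Cross-modal instance discrimination of this kind is typically implemented as a symmetric InfoNCE loss over a batch: each video embedding should match its own audio clip against all others, and vice-versa. The following is a minimal NumPy sketch of that standard loss, assuming precomputed embeddings; it is not the paper's implementation, and the temperature value is an illustrative default.

```python
import numpy as np

def cross_modal_infonce(video_emb, audio_emb, tau=0.07):
    """Symmetric cross-modal contrastive loss: matched video/audio pairs
    sit on the diagonal of the (B, B) similarity matrix and are scored
    against all in-batch negatives in both directions."""
    v = video_emb / np.linalg.norm(video_emb, axis=1, keepdims=True)
    a = audio_emb / np.linalg.norm(audio_emb, axis=1, keepdims=True)
    logits = v @ a.T / tau                 # (B, B) scaled cosine similarities
    targets = np.arange(len(v))

    def ce(l):  # cross-entropy with the matched pair as the target class
        l = l - l.max(axis=1, keepdims=True)          # numerical stability
        logp = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -logp[targets, targets].mean()

    return 0.5 * (ce(logits) + ce(logits.T))          # video→audio + audio→video

# Sanity check: correctly paired embeddings incur a lower loss than shuffled ones.
rng = np.random.default_rng(0)
emb = rng.normal(size=(4, 8))
loss_matched = cross_modal_infonce(emb, emb)
loss_shuffled = cross_modal_infonce(emb, emb[::-1].copy())
print(loss_matched < loss_shuffled)  # → True
```

Trained versions replace the random embeddings with encoder outputs and backpropagate through both modalities; the symmetric form is what "video from audio and vice-versa" refers to.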
Ranked #3 on Self-Supervised Audio Classification on ESC-50
1 code implementation • CVPR 2019 • Pedro Morgado, Nuno Vasconcelos
Under the standard paradigm of network fine-tuning, an entirely new CNN is learned per task, and the final network size is independent of task complexity.
no code implementations • NeurIPS 2018 • Pedro Morgado, Nuno Vasconcelos, Timothy Langlois, Oliver Wang
We introduce an approach to convert mono audio recorded by a 360° video camera into spatial audio, a representation of the distribution of sound over the full viewing sphere.
1 code implementation • 7 Sep 2018 • Pedro Morgado, Nuno Vasconcelos, Timothy Langlois, Oliver Wang
Using our approach, we show that it is possible to infer the spatial location of sound sources based only on 360° video and a mono audio track.
1 code implementation • CVPR 2017 • Pedro Morgado, Nuno Vasconcelos
The role of semantics in zero-shot learning is considered.