Search Results for author: Otniel-Bogdan Mercea

Found 7 papers, 6 papers with code

Time-, Memory- and Parameter-Efficient Visual Adaptation

no code implementations 5 Feb 2024 Otniel-Bogdan Mercea, Alexey Gritsenko, Cordelia Schmid, Anurag Arnab

Here, we outperform a prior adaptor-based method which could only scale to a 1 billion parameter backbone, or fully-finetuning a smaller backbone, with the same GPU and less training time.

Video Classification

Video-adverb retrieval with compositional adverb-action embeddings

1 code implementation 26 Sep 2023 Thomas Hummel, Otniel-Bogdan Mercea, A. Sophia Koepke, Zeynep Akata

We propose a framework for video-to-adverb retrieval (and vice versa) that aligns video embeddings with their matching compositional adverb-action text embedding in a joint embedding space.

Video-Adverb Retrieval (Unseen Compositions)

Text-to-feature diffusion for audio-visual few-shot learning

1 code implementation 7 Sep 2023 Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata

Training deep learning models for video classification from audio-visual data commonly requires immense amounts of labeled training data collected via a costly process.

Classification, Few-Shot Learning +1

Temporal and cross-modal attention for audio-visual zero-shot learning

2 code implementations 20 Jul 2022 Otniel-Bogdan Mercea, Thomas Hummel, A. Sophia Koepke, Zeynep Akata

We show that our proposed framework that ingests temporal features yields state-of-the-art performance on the UCF, VGGSound, and ActivityNet benchmarks for (generalised) zero-shot learning.

GZSL Video Classification, Video Classification

Audio-visual Generalised Zero-shot Learning with Cross-modal Attention and Language

1 code implementation CVPR 2022 Otniel-Bogdan Mercea, Lukas Riesch, A. Sophia Koepke, Zeynep Akata

Focusing on the relatively underexplored task of audio-visual zero-shot learning, we propose to learn multi-modal representations from audio-visual data using cross-modal attention and exploit textual label embeddings for transferring knowledge from seen classes to unseen classes.

GZSL Video Classification, ZSL Video Classification
