Search Results for author: Juan-Manuel Perez-Rua

Found 19 papers, 8 papers with code

Hyper-VolTran: Fast and Generalizable One-Shot Image to 3D Object Structure via HyperNetworks

no code implementations • 24 Dec 2023 • Christian Simon, Sen He, Juan-Manuel Perez-Rua, Mengmeng Xu, Amine Benhalloum, Tao Xiang

Solving image-to-3D from a single view is an ill-posed problem, and current neural reconstruction methods addressing it through diffusion models still rely on scene-specific optimization, constraining their generalization capability.

Image to 3D Neural Rendering

GenTron: Delving Deep into Diffusion Transformers for Image and Video Generation

no code implementations • 7 Dec 2023 • Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua

In this study, we explore Transformer-based diffusion models for image and video generation.

Text-to-Video Generation Video Generation

FLATTEN: optical FLow-guided ATTENtion for consistent text-to-video editing

no code implementations • 9 Oct 2023 • Yuren Cong, Mengmeng Xu, Christian Simon, Shoufa Chen, Jiawei Ren, Yanping Xie, Juan-Manuel Perez-Rua, Bodo Rosenhahn, Tao Xiang, Sen He

In this paper, for the first time, we introduce optical flow into the attention module in the diffusion model's U-Net to address the inconsistency issue for text-to-video editing.

Optical Flow Estimation Text-to-Video Editing +1

Multi-Modal Few-Shot Temporal Action Detection

1 code implementation • 27 Nov 2022 • Sauradip Nag, Mengmeng Xu, Xiatian Zhu, Juan-Manuel Perez-Rua, Bernard Ghanem, Yi-Zhe Song, Tao Xiang

In this work, we introduce a new multi-modality few-shot (MMFS) TAD problem, which can be considered as a marriage of FS-TAD and ZS-TAD by leveraging few-shot support videos and new class names jointly.

Action Detection Few-Shot Object Detection +3

Where is my Wallet? Modeling Object Proposal Sets for Egocentric Visual Query Localization

1 code implementation • CVPR 2023 • Mengmeng Xu, Yanghao Li, Cheng-Yang Fu, Bernard Ghanem, Tao Xiang, Juan-Manuel Perez-Rua

Our experiments show the proposed adaptations improve egocentric query detection, leading to a better visual query localization system in both 2D and 3D configurations.

Object

Negative Frames Matter in Egocentric Visual Query 2D Localization

1 code implementation • 3 Aug 2022 • Mengmeng Xu, Cheng-Yang Fu, Yanghao Li, Bernard Ghanem, Juan-Manuel Perez-Rua, Tao Xiang

(1) The repeated gradient computation of the same object leads to inefficient training; (2) the false positive rate is high on background frames.

Object

SAIC_Cambridge-HuPBA-FBK Submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021

no code implementations • 6 Oct 2021 • Swathikiran Sudhakaran, Adrian Bulat, Juan-Manuel Perez-Rua, Alex Falcon, Sergio Escalera, Oswald Lanz, Brais Martinez, Georgios Tzimiropoulos

This report presents the technical details of our submission to the EPIC-Kitchens-100 Action Recognition Challenge 2021.

Action Recognition Temporal Action Localization

TNT: Text-Conditioned Network with Transductive Inference for Few-Shot Video Classification

2 code implementations • 21 Jun 2021 • Andrés Villa, Juan-Manuel Perez-Rua, Vladimir Araujo, Juan Carlos Niebles, Victor Escorcia, Alvaro Soto

Recently, few-shot learning has received increasing interest.

Action Classification Classification +3

Space-time Mixing Attention for Video Transformer

1 code implementation • NeurIPS 2021 • Adrian Bulat, Juan-Manuel Perez-Rua, Swathikiran Sudhakaran, Brais Martinez, Georgios Tzimiropoulos

In this work, we propose a Video Transformer model the complexity of which scales linearly with the number of frames in the video sequence and hence induces no overhead compared to an image-based Transformer model.

Ranked #32 on Action Classification on Kinetics-600

Action Classification Action Recognition In Videos +1

Low-Fidelity End-to-End Video Encoder Pre-training for Temporal Action Localization

no code implementations • 28 Mar 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Xiatian Zhu, Bernard Ghanem, Brais Martinez

This results in a task discrepancy problem for the video encoder -- trained for action classification, but used for TAL.

Action Classification Model Optimization +3

Few-shot Action Recognition with Prototype-centered Attentive Learning

1 code implementation • 20 Jan 2021 • Xiatian Zhu, Antoine Toisoul, Juan-Manuel Perez-Rua, Li Zhang, Brais Martinez, Tao Xiang

Extensive experiments on four standard few-shot action benchmarks show that our method clearly outperforms previous state-of-the-art methods, with the improvement particularly significant (10+%) on the most challenging fine-grained action recognition benchmark.

Contrastive Learning Few-Shot action recognition +3

Boundary-sensitive Pre-training for Temporal Localization in Videos

1 code implementation • ICCV 2021 • Mengmeng Xu, Juan-Manuel Perez-Rua, Victor Escorcia, Brais Martinez, Xiatian Zhu, Li Zhang, Bernard Ghanem, Tao Xiang

However, most existing models developed for these tasks are pre-trained on general video action classification tasks.

Ranked #23 on Temporal Action Localization on ActivityNet-1.3

Action Classification Classification +3

Egocentric Action Recognition by Video Attention and Temporal Context

no code implementations • 3 Jul 2020 • Juan-Manuel Perez-Rua, Antoine Toisoul, Brais Martinez, Victor Escorcia, Li Zhang, Xiatian Zhu, Tao Xiang

In this challenge, action recognition is posed as the problem of simultaneously predicting a single 'verb' and 'noun' class label given an input trimmed video clip.

Action Recognition

Knowing What, Where and When to Look: Efficient Video Action Modeling with Attention

no code implementations • 2 Apr 2020 • Juan-Manuel Perez-Rua, Brais Martinez, Xiatian Zhu, Antoine Toisoul, Victor Escorcia, Tao Xiang

Departing from existing alternatives, our W3 module models all three facets of video attention jointly.

Ranked #1 on Action Recognition on EgoGesture

Action Recognition

Incremental Few-Shot Object Detection

no code implementations • CVPR 2020 • Juan-Manuel Perez-Rua, Xiatian Zhu, Timothy Hospedales, Tao Xiang

To this end we propose OpeN-ended Centre nEt (ONCE), a detector designed for incrementally learning to detect novel class objects with few examples.

Few-Shot Learning Few-Shot Object Detection +3

MFAS: Multimodal Fusion Architecture Search

1 code implementation • CVPR 2019 • Juan-Manuel Perez-Rua, Valentin Vielzeuf, Stephane Pateux, Moez Baccouche, Frederic Jurie

We tackle the problem of finding good architectures for multimodal classification problems.

Action Recognition General Classification +2

Efficient Progressive Neural Architecture Search

no code implementations • 1 Aug 2018 • Juan-Manuel Perez-Rua, Moez Baccouche, Stephane Pateux

We demonstrate with experiments on the CIFAR-10 dataset that our method, denominated Efficient progressive neural architecture search (EPNAS), leads to increased search efficiency, while retaining competitiveness of found architectures.

General Classification Image Classification +1

Learning how to be robust: Deep polynomial regression

no code implementations • 17 Apr 2018 • Juan-Manuel Perez-Rua, Tomas Crivelli, Patrick Bouthemy, Patrick Perez

We bypass the need for a tailored loss function on the regression parameters by attaching to our model a differentiable hard-wired decoder corresponding to the polynomial operation at hand.

regression Video Stabilization

Determining Occlusions From Space and Time Image Reconstructions

no code implementations • CVPR 2016 • Juan-Manuel Perez-Rua, Tomas Crivelli, Patrick Bouthemy, Patrick Perez

With this in mind, we propose a novel approach to occlusion detection in which the visibility of a point in the next frame is formulated in terms of visual reconstruction.
