1 code implementation • 21 Dec 2023 • Shuzhe Wang, Vincent Leroy, Yohann Cabon, Boris Chidlovskii, Jerome Revaud
Our formulation directly provides a 3D model of the scene as well as depth information; interestingly, from it we can seamlessly recover pixel matches as well as relative and absolute camera poses.
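Recovering pixel matches from such a 3D output can be sketched as a nearest-neighbour search between the two views' per-pixel pointmaps. This is a minimal illustration only, assuming both pointmaps are `(H, W, 3)` arrays expressed in a common frame; the function name and the brute-force search are our simplification, not the paper's implementation.

```python
import numpy as np

def match_from_pointmaps(pm1, pm2):
    """Sketch: pixel matches from two per-pixel 3D pointmaps.

    pm1, pm2: (H, W, 3) arrays of 3D points in a common frame.
    For every pixel of view 1 we pick the view-2 pixel whose 3D
    point is closest; mutual checks and confidence weighting,
    which a real system would need, are omitted here.
    """
    H, W, _ = pm1.shape
    p1 = pm1.reshape(-1, 3)
    p2 = pm2.reshape(-1, 3)
    # brute-force pairwise distances in 3D (fine for small H*W)
    d = np.linalg.norm(p1[:, None, :] - p2[None, :, :], axis=-1)
    nn = d.argmin(axis=1)
    # each row: (flat pixel index in view 1, matched index in view 2)
    return np.stack([np.arange(H * W), nn], axis=1)
```

With matches in hand, relative pose could then be estimated by standard means (e.g. Procrustes alignment or PnP on the matched 3D points).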
no code implementations • 3 Oct 2023 • Jongmin Lee, Yohann Cabon, Romain Brégier, Sungjoo Yoo, Jerome Revaud
Existing learning-based methods for object pose estimation in RGB images are mostly model-specific or category-based.
no code implementations • 21 Jul 2023 • Jerome Revaud, Yohann Cabon, Romain Brégier, Jongmin Lee, Philippe Weinzaepfel
Instead of encoding the scene coordinates into the network weights, our model takes as input a database image with sparse 2D-pixel-to-3D-coordinate annotations (extracted from, e.g., off-the-shelf Structure-from-Motion or RGB-D data) and a query image, for which it predicts a dense 3D coordinate map and its confidence via cross-attention.
1 code implementation • ICCV 2023 • Philippe Weinzaepfel, Thomas Lucas, Vincent Leroy, Yohann Cabon, Vaibhav Arora, Romain Brégier, Gabriela Csurka, Leonid Antsfeld, Boris Chidlovskii, Jérôme Revaud
Despite impressive performance for high-level downstream tasks, self-supervised pre-training methods have not yet fully delivered on dense geometric vision tasks such as stereo matching or optical flow.
Ranked #1 on Optical Flow Estimation on KITTI 2012
1 code implementation • 19 Oct 2022 • Philippe Weinzaepfel, Vincent Leroy, Thomas Lucas, Romain Brégier, Yohann Cabon, Vaibhav Arora, Leonid Antsfeld, Boris Chidlovskii, Gabriela Csurka, Jérôme Revaud
More precisely, we propose the pretext task of cross-view completion where the first input image is partially masked, and this masked content has to be reconstructed from the visible content and the second image.
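The cross-view completion pretext task described above can be sketched as follows. This is a minimal data-preparation illustration, not the paper's code: the function name, the patch size of 16, and the 90% mask ratio are our assumptions.

```python
import numpy as np

def cross_view_completion_batch(img1, img2, patch=16, mask_ratio=0.9, rng=None):
    """Sketch: build one cross-view completion training example.

    img1 is split into patches and mostly masked; a model would have
    to reconstruct the masked patches from img1's few visible patches
    plus the full second view img2 of the same scene.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    H, W, C = img1.shape
    n = (H // patch) * (W // patch)            # patches per image
    patches = img1.reshape(H // patch, patch, W // patch, patch, C)
    patches = patches.transpose(0, 2, 1, 3, 4).reshape(n, -1)
    n_mask = int(mask_ratio * n)
    order = rng.permutation(n)
    masked_idx, visible_idx = order[:n_mask], order[n_mask:]
    visible = patches[visible_idx]             # encoder input from view 1
    target = patches[masked_idx]               # reconstruction target
    return visible, target, img2               # second view is seen in full
```

In an actual pipeline both views would be encoded (e.g. with a ViT) and a decoder would regress `target` from `visible` plus the second view's features; only the masking setup is shown here.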
1 code implementation • 31 May 2022 • Martin Humenberger, Yohann Cabon, Noé Pion, Philippe Weinzaepfel, Donghwan Lee, Nicolas Guérin, Torsten Sattler, Gabriela Csurka
In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms.
no code implementations • CVPR 2021 • Donghwan Lee, Soohyun Ryu, Suyong Yeon, Yonghan Lee, Deokhwa Kim, Cheolho Han, Yohann Cabon, Philippe Weinzaepfel, Nicolas Guérin, Gabriela Csurka, Martin Humenberger
In this paper, we introduce 5 new indoor datasets for visual localization in challenging real-world environments.
1 code implementation • 24 Nov 2020 • Noé Pion, Martin Humenberger, Gabriela Csurka, Yohann Cabon, Torsten Sattler
This paper focuses on understanding the role of image retrieval for multiple visual localization tasks.
2 code implementations • 27 Jul 2020 • Martin Humenberger, Yohann Cabon, Nicolas Guerin, Julien Morat, Vincent Leroy, Jérôme Revaud, Philippe Rerole, Noé Pion, Cesar De Souza, Gabriela Csurka
To demonstrate this, we present a versatile pipeline for visual localization that facilitates the use of different local and global features, 3D data (e.g. depth maps), non-vision sensor data (e.g. IMU, GPS, WiFi), and various processing algorithms.
no code implementations • 29 Jan 2020 • Yohann Cabon, Naila Murray, Martin Humenberger
This paper introduces an updated version of the well-known Virtual KITTI dataset, which consists of 5 sequence clones from the KITTI tracking benchmark.
no code implementations • 12 Oct 2019 • César Roberto de Souza, Adrien Gaidon, Yohann Cabon, Naila Murray, Antonio Manuel López
With this model we generate a diverse, realistic, and physically plausible dataset of human action videos, called PHAV for "Procedural Human Action Videos".
1 code implementation • 14 Jun 2019 • Jerome Revaud, Philippe Weinzaepfel, César De Souza, Noe Pion, Gabriela Csurka, Yohann Cabon, Martin Humenberger
In this work, we argue that salient regions are not necessarily discriminative, and therefore can harm the performance of the description.
no code implementations • CVPR 2017 • César Roberto de Souza, Adrien Gaidon, Yohann Cabon, Antonio Manuel López Peña
Deep learning for human action recognition in videos is making significant progress, but is slowed down by its dependency on expensive manual labeling of large video collections.
no code implementations • CVPR 2016 • Adrien Gaidon, Qiao Wang, Yohann Cabon, Eleonora Vig
We provide quantitative experimental evidence suggesting that (i) modern deep learning algorithms pre-trained on real data behave similarly in real and virtual worlds, and (ii) pre-training on virtual data improves performance.