Search Results for author: Guillaume Couairon

Found 13 papers, 7 papers with code

FreeSeg-Diff: Training-Free Open-Vocabulary Segmentation with Diffusion Models

no code implementations • 29 Mar 2024 • Barbara Toniella Corradini, Mustafa Shukor, Paul Couairon, Guillaume Couairon, Franco Scarselli, Matthieu Cord

The pipeline is as follows: the image is passed to both a captioner model (i.e., BLIP) and a diffusion model (i.e., Stable Diffusion Model) to generate a text description and visual representation, respectively.

Image Generation Image Segmentation +3

Functional Invariants to Watermark Large Transformers

no code implementations • 17 Oct 2023 • Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze

The rapid growth of transformer-based models increases the concerns about their integrity and ownership insurance.

Quantization

Gradpaint: Gradient-Guided Inpainting with Diffusion Models

no code implementations • 18 Sep 2023 • Asya Grechka, Guillaume Couairon, Matthieu Cord

For the specific task of image inpainting, the current guiding mechanism relies on copying-and-pasting the known regions from the input image at each denoising step.

Denoising Image Inpainting +1
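
The copy-and-paste guidance mentioned in the Gradpaint snippet can be illustrated with a toy sketch. This is not the paper's method or code: `toy_denoise_step` is a hypothetical placeholder for a real diffusion denoiser, and the linear noise schedule is an assumption chosen only so the example runs on its own.

```python
import numpy as np

def paste_known_regions(x, known, mask):
    """Copy known pixels where mask == 1, keep generated values where mask == 0."""
    return mask * known + (1.0 - mask) * x

def toy_denoise_step(x):
    """Hypothetical stand-in for a diffusion denoiser: shrinks values toward 0."""
    return 0.9 * x

def inpaint(image, mask, num_steps=10, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(image.shape)       # start from pure noise
    sigmas = np.linspace(1.0, 0.0, num_steps)  # assumed schedule, ends at 0
    for sigma in sigmas:
        x = toy_denoise_step(x)
        # Noise the known pixels to the current level, then paste them in.
        noisy_known = image + sigma * rng.standard_normal(image.shape)
        x = paste_known_regions(x, noisy_known, mask)
    return x

image = np.arange(16, dtype=float).reshape(4, 4)
mask = np.zeros((4, 4))
mask[:, :2] = 1.0                              # left half of the image is known
result = inpaint(image, mask)
```

Because the assumed noise level reaches zero at the last step, the known half of `result` matches `image` exactly, while the unknown half is whatever the placeholder denoiser produced.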

Zero-shot spatial layout conditioning for text-to-image diffusion models

no code implementations • ICCV 2023 • Guillaume Couairon, Marlène Careil, Matthieu Cord, Stéphane Lathuilière, Jakob Verbeek

Large-scale text-to-image diffusion models have significantly improved the state of the art in generative image modelling and allow for an intuitive and powerful user interface to drive the image generation process.

Image Generation Segmentation +1

Very high resolution canopy height maps from RGB imagery using self-supervised vision transformer and convolutional decoder trained on Aerial Lidar

no code implementations • 14 Apr 2023 • Jamie Tolan, Hung-I Yang, Ben Nosarzewski, Guillaume Couairon, Huy Vo, John Brandt, Justine Spore, Sayantan Majumdar, Daniel Haziza, Janaki Vamaraju, Theo Moutakanni, Piotr Bojanowski, Tracy Johns, Brian White, Tobias Tiecke, Camille Couprie

The maps are generated by the extraction of features from a self-supervised model trained on Maxar imagery from 2017 to 2020, and the training of a dense prediction decoder against aerial lidar maps.

Self-Supervised Learning

The Stable Signature: Rooting Watermarks in Latent Diffusion Models

1 code implementation • ICCV 2023 • Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon

For instance, it detects the origin of an image generated from a text prompt, then cropped to keep $10\%$ of the content, with $90\%$+ accuracy at a false positive rate below $10^{-6}$.
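
To give a feel for what a false positive rate below $10^{-6}$ implies, here is a minimal worked sketch. It assumes, purely for illustration, that detection compares a `k`-bit extracted message against a reference signature and flags the image when at least `tau` bits match, and that bits extracted from an unwatermarked image behave like fair coin flips. The snippet above does not state these details, and the function names and the choice `k = 48` are hypothetical.

```python
from math import comb

def false_positive_rate(k, tau):
    """P(at least tau of k fair random bits match the signature)."""
    tail = sum(comb(k, i) for i in range(tau, k + 1))
    return tail / 2**k

def min_threshold(k, target_fpr):
    """Smallest number of matching bits whose false positive rate is below target."""
    for tau in range(k + 1):
        if false_positive_rate(k, tau) < target_fpr:
            return tau
    return None  # no threshold reaches the target for this k

k = 48  # hypothetical signature length
tau = min_threshold(k, 1e-6)
fpr = false_positive_rate(k, tau)
```

Under these assumptions, `tau` is the number of bits that must match before an image is attributed to the watermarked model, and `fpr` is the resulting chance of wrongly flagging an unrelated image.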

DiffEdit: Diffusion-based semantic image editing with mask guidance

4 code implementations • 20 Oct 2022 • Guillaume Couairon, Jakob Verbeek, Holger Schwenk, Matthieu Cord

Semantic image editing is an extension of image generation, with the additional constraint that the generated image should be as similar as possible to a given input image.

Image Generation

Efficient Vision-Language Pretraining with Visual Concepts and Hierarchical Alignment

1 code implementation • 29 Aug 2022 • Mustafa Shukor, Guillaume Couairon, Matthieu Cord

Vision and Language Pretraining has become the prevalent approach for tackling multimodal downstream tasks.

Retrieval Text Retrieval +4

Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval

1 code implementation • 20 Apr 2022 • Mustafa Shukor, Guillaume Couairon, Asya Grechka, Matthieu Cord

We propose a new retrieval framework, T-Food (Transformer Decoders with MultiModal Regularization for Cross-Modal Food Retrieval) that exploits the interaction between modalities in a novel regularization scheme, while using only unimodal encoders at test time for efficient retrieval.

Ranked #3 on Cross-Modal Retrieval on Recipe1M

Cross-Modal Retrieval Retrieval

FlexIT: Towards Flexible Semantic Image Translation

1 code implementation • CVPR 2022 • Guillaume Couairon, Asya Grechka, Jakob Verbeek, Holger Schwenk, Matthieu Cord

Via the latent space of an auto-encoder, we iteratively transform the input image toward the target point, ensuring coherence and quality with a variety of novel regularization terms.

Image Generation Translation

FLAVA: A Foundational Language And Vision Alignment Model

3 code implementations • CVPR 2022 • Amanpreet Singh, Ronghang Hu, Vedanuj Goswami, Guillaume Couairon, Wojciech Galuba, Marcus Rohrbach, Douwe Kiela

State-of-the-art vision and vision-and-language models rely on large-scale visio-linguistic pretraining for obtaining good performance on a variety of downstream tasks.

Ranked #4 on Image Retrieval on MS COCO