Search Results for author: Matthijs Douze

Found 46 papers, 28 papers with code

Vector search with small radiuses

no code implementations • 16 Mar 2024 • Gergely Szilvasy, Pierre-Emmanuel Mazaré, Matthijs Douze

Although convenient to compute, this metric is distantly related to the end-to-end accuracy of a full system that integrates vector search.

Image Retrieval Retrieval

Watermarking Makes Language Models Radioactive

no code implementations • 22 Feb 2024 • Tom Sander, Pierre Fernandez, Alain Durmus, Matthijs Douze, Teddy Furon

This paper investigates the radioactivity of LLM-generated texts, i.e., whether it is possible to detect that such input was used as training data.

Residual Quantization with Implicit Neural Codebooks

1 code implementation • 26 Jan 2024 • Iris Huijben, Matthijs Douze, Matthew Muckley, Ruud Van Sloun, Jakob Verbeek

In this paper, we propose QINCo, a neural RQ variant which predicts specialized codebooks per vector using a neural network that is conditioned on the approximation of the vector from previous steps.

Data Compression Quantization

The Faiss library

1 code implementation • 16 Jan 2024 • Matthijs Douze, Alexandr Guzhva, Chengqi Deng, Jeff Johnson, Gergely Szilvasy, Pierre-Emmanuel Mazaré, Maria Lomeli, Lucas Hosseini, Hervé Jégou

The Faiss library is dedicated to vector similarity search, a core functionality of vector databases.

28,143

Functional Invariants to Watermark Large Transformers

no code implementations • 17 Oct 2023 • Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze

The rapid growth of transformer-based models increases the concerns about their integrity and ownership insurance.

Quantization

DeDrift: Robust Similarity Search under Content Drift

no code implementations • ICCV 2023 • Dmitry Baranchuk, Matthijs Douze, Yash Upadhyay, I. Zeki Yalniz

We investigate the impact of this "content drift" for large-scale similarity search tools, based on nearest neighbor search in embedding space.

The 2023 Video Similarity Dataset and Challenge

1 code implementation • 15 Jun 2023 • Ed Pizzi, Giorgos Kordopatis-Zilos, Hiral Patel, Gheorghe Postelnicu, Sugosh Nagavara Ravindra, Akshay Gupta, Symeon Papadopoulos, Giorgos Tolias, Matthijs Douze

The problem comprises two distinct but related tasks: determining whether a query video shares content with a reference video ("detection"), and additionally temporally localizing the shared content within each video ("localization").

Copy Detection Video Similarity

The Stable Signature: Rooting Watermarks in Latent Diffusion Models

1 code implementation • ICCV 2023 • Pierre Fernandez, Guillaume Couairon, Hervé Jégou, Matthijs Douze, Teddy Furon

For instance, it detects the origin of an image generated from a text prompt, then cropped to keep 10% of the content, with 90+% accuracy at a false positive rate below $10^{-6}$.

293

Active Image Indexing

1 code implementation • 5 Oct 2022 • Pierre Fernandez, Matthijs Douze, Hervé Jégou, Teddy Furon

First, a neural network maps an image to a vector representation that is relatively robust to various transformations of the image.

Copy Detection Quantization +1

Results of the NeurIPS'21 Challenge on Billion-Scale Approximate Nearest Neighbor Search

no code implementations • 8 May 2022 • Harsha Vardhan Simhadri, George Williams, Martin Aumüller, Matthijs Douze, Artem Babenko, Dmitry Baranchuk, Qi Chen, Lucas Hosseini, Ravishankar Krishnaswamy, Gopal Srinivasa, Suhas Jayaram Subramanya, Jingdong Wang

The outcome of the competition was ranked leaderboards of algorithms in each track based on recall at a query throughput threshold.

A Self-Supervised Descriptor for Image Copy Detection

1 code implementation • CVPR 2022 • Ed Pizzi, Sreya Dutta Roy, Sugosh Nagavara Ravindra, Priya Goyal, Matthijs Douze

We adapt this method to the copy detection task by changing the architecture and training objective, including a pooling operator from the instance matching literature, and adapting contrastive learning to augmentations that combine images.

Contrastive Learning Copy Detection +1

208

Results and findings of the 2021 Image Similarity Challenge

no code implementations • 8 Feb 2022 • Zoë Papakipos, Giorgos Tolias, Tomas Jenicek, Ed Pizzi, Shuhei Yokoo, Wenhao Wang, Yifan Sun, Weipu Zhang, Yi Yang, Sanjay Addicam, Sergio Manuel Papadakis, Cristian Canton Ferrer, Ondrej Chum, Matthijs Douze

The 2021 Image Similarity Challenge introduced a dataset to serve as a new benchmark to evaluate recent image copy detection methods.

Copy Detection Self-Supervised Learning

Nearest neighbor search with compact codes: A decoder perspective

no code implementations • 17 Dec 2021 • Kenza Amara, Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou

Modern approaches for fast retrieval of similar vectors on billion-scale datasets rely on compressed-domain approaches such as binary sketches or product quantization.

Quantization Retrieval

Watermarking Images in Self-Supervised Latent Spaces

1 code implementation • 17 Dec 2021 • Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze

We revisit watermarking techniques based on pre-trained deep networks, in the light of self-supervised approaches.

Data Augmentation

Embedding Arithmetic of Multimodal Queries for Image Retrieval

no code implementations • 6 Dec 2021 • Guillaume Couairon, Matthieu Cord, Matthijs Douze, Holger Schwenk

We introduce the SIMAT dataset to evaluate the task of Image Retrieval with Multimodal queries.

Image Retrieval Image-text matching +3

The 2021 Image Similarity Dataset and Challenge

1 code implementation • 17 Jun 2021 • Matthijs Douze, Giorgos Tolias, Ed Pizzi, Zoë Papakipos, Lowik Chanussot, Filip Radenovic, Tomas Jenicek, Maxim Maximov, Laura Leal-Taixé, Ismail Elezi, Ondřej Chum, Cristian Canton Ferrer

This benchmark is used for the Image Similarity Challenge at NeurIPS'21 (ISC2021).

Ranked #1 on Image Similarity Detection on DISC21 dev

Copy Detection Image Similarity Detection +1

192

XCiT: Cross-Covariance Image Transformers

11 code implementations • NeurIPS 2021 • Alaaeldin El-Nouby, Hugo Touvron, Mathilde Caron, Piotr Bojanowski, Matthijs Douze, Armand Joulin, Ivan Laptev, Natalia Neverova, Gabriel Synnaeve, Jakob Verbeek, Hervé Jégou

We propose a "transposed" version of self-attention that operates across feature channels rather than tokens, where the interactions are based on the cross-covariance matrix between keys and queries.

Ranked #55 on Instance Segmentation on COCO minival

Instance Segmentation object-detection +3

29,735

LeViT: a Vision Transformer in ConvNet's Clothing for Faster Inference

11 code implementations • ICCV 2021 • Ben Graham, Alaaeldin El-Nouby, Hugo Touvron, Pierre Stock, Armand Joulin, Hervé Jégou, Matthijs Douze

We design a family of image classification architectures that optimize the trade-off between accuracy and efficiency in a high-speed regime.

Ranked #11 on Image Classification on iNaturalist 2019

General Classification Image Classification

124,889

Training data-efficient image transformers & distillation through attention

33 code implementations • 23 Dec 2020 • Hugo Touvron, Matthieu Cord, Matthijs Douze, Francisco Massa, Alexandre Sablayrolles, Hervé Jégou

In this work, we produce a competitive convolution-free transformer by training on Imagenet only.

Ranked #4 on Efficient ViTs on ImageNet-1K (with DeiT-S)

Document Image Classification Document Layout Analysis +2

124,889

Grafit: Learning fine-grained image representations with coarse labels

no code implementations • ICCV 2021 • Hugo Touvron, Alexandre Sablayrolles, Matthijs Douze, Matthieu Cord, Hervé Jégou

By jointly leveraging the coarse labels and the underlying fine-grained latent space, it significantly improves the accuracy of category-level retrieval methods.

Ranked #2 on Learning with coarse labels on cifar100

Fine-Grained Image Classification Learning with coarse labels +3

Powers of layers for image-to-image translation

no code implementations • 13 Aug 2020 • Hugo Touvron, Matthijs Douze, Matthieu Cord, Hervé Jégou

We propose a simple architecture to address unpaired image-to-image translation tasks: style or class transfer, denoising, deblurring, deblocking, etc.

Ranked #1 on Image-to-Image Translation on horse2zebra (Frechet Inception Distance metric)

Deblurring Denoising +2

Generalized Few-Shot Video Classification with Video Retrieval and Feature Generation

1 code implementation • 9 Jul 2020 • Yongqin Xian, Bruno Korbar, Matthijs Douze, Lorenzo Torresani, Bernt Schiele, Zeynep Akata

Few-shot learning aims to recognize novel classes from a few examples.

Few-Shot Image Classification Few-Shot Learning +7

Data Augmenting Contrastive Learning of Speech Representations in the Time Domain

1 code implementation • 2 Jul 2020 • Eugene Kharitonov, Morgane Rivière, Gabriel Synnaeve, Lior Wolf, Pierre-Emmanuel Mazaré, Matthijs Douze, Emmanuel Dupoux

Contrastive Predictive Coding (CPC), which predicts future segments of speech from past segments, is emerging as a powerful algorithm for representation learning of speech signals.

Contrastive Learning Data Augmentation +1

626

Fixing the train-test resolution discrepancy: FixEfficientNet

1 code implementation • 18 Mar 2020 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

An EfficientNet-L2 pre-trained with weak supervision on 300M unlabeled images and further optimized with FixRes achieves 88.5% top-1 accuracy (top-5: 98.7%), which establishes the new state of the art for ImageNet with a single crop.

Ranked #9 on Image Classification on ImageNet ReaL (using extra training data)

Data Augmentation Image Classification

1,023

Radioactive data: tracing through training

2 code implementations • ICML 2020 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

The mark is robust to strong variations such as different architectures or optimization methods.

Data Augmentation Data Poisoning

Lead2Gold: Towards exploiting the full potential of noisy transcriptions for speech recognition

no code implementations • 16 Oct 2019 • Adrien Dufraux, Emmanuel Vincent, Awni Hannun, Armelle Brun, Matthijs Douze

The transcriptions used to train an Automatic Speech Recognition (ASR) system may contain errors.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

White-box vs Black-box: Bayes Optimal Strategies for Membership Inference

no code implementations • 29 Aug 2019 • Alexandre Sablayrolles, Matthijs Douze, Yann Ollivier, Cordelia Schmid, Hervé Jégou

Membership inference determines, given a sample and trained parameters of a machine learning model, whether the sample was part of the training set.

Fixing the train-test resolution discrepancy

3 code implementations • NeurIPS 2019 • Hugo Touvron, Andrea Vedaldi, Matthijs Douze, Hervé Jégou

Conversely, when training a ResNeXt-101 32x48d pre-trained in weakly-supervised fashion on 940 million public images at resolution 224x224 and further optimizing for test resolution 320x320, we obtain a test top-1 accuracy of 86.4% (top-5: 98.0%) (single-crop).

Ranked #2 on Fine-Grained Image Classification on Birdsnap (using extra training data)

Data Augmentation Fine-Grained Image Classification +1

1,023

MultiGrain: a unified image embedding for classes and instances

3 code implementations • 14 Feb 2019 • Maxim Berman, Hervé Jégou, Andrea Vedaldi, Iasonas Kokkinos, Matthijs Douze

When fed to a linear classifier, the learned embeddings provide state-of-the-art classification accuracy.

Ranked #1 on Image Retrieval on INRIA Holidays

Classification Data Augmentation +5

230

Déjà Vu: an empirical evaluation of the memorization properties of ConvNets

no code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

Convolutional neural networks memorize part of their training data, which is why strategies such as data augmentation and drop-out are employed to mitigate overfitting.

Data Augmentation Memorization

Deep Clustering for Unsupervised Learning of Visual Features

9 code implementations • ECCV 2018 • Mathilde Caron, Piotr Bojanowski, Armand Joulin, Matthijs Douze

In this work, we present DeepCluster, a clustering method that jointly learns the parameters of a neural network and the cluster assignments of the resulting features.

Ranked #1 on Image Clustering on CIFAR-100 (Train Set metric, using extra training data)

Clustering Deep Clustering +2

1,632

Spreading vectors for similarity search

2 code implementations • ICLR 2019 • Alexandre Sablayrolles, Matthijs Douze, Cordelia Schmid, Hervé Jégou

Discretizing multi-dimensional data distributions is a fundamental step of modern indexing methods.

Quantization

317

LAMV: Learning to Align and Match Videos With Kernelized Temporal Layers

1 code implementation • CVPR 2018 • Lorenzo Baraldi, Matthijs Douze, Rita Cucchiara, Hervé Jégou

This paper considers a learnable approach for comparing and aligning videos.

Ranked #4 on Video Alignment on MSU Video Alignment and Retrieval Benchmark Suite

Copy Detection Retrieval +2

132

Link and code: Fast indexing with graphs and compact regression codes

6 code implementations • CVPR 2018 • Matthijs Douze, Alexandre Sablayrolles, Hervé Jégou

Similarity search approaches based on graph walks have recently attained outstanding speed-accuracy trade-offs, setting aside the memory requirements.

Image Similarity Search Quantization +1

28,143

An evaluation of large-scale methods for image instance and class discovery

no code implementations • 9 Aug 2017 • Matthijs Douze, Hervé Jégou, Jeff Johnson

While k-means is usually considered as the gold standard for this task, we evaluate and show the interest of diffusion methods that have been neglected by the state of the art, such as the Markov Clustering algorithm.

Clustering Instance Search

Low-shot learning with large-scale diffusion

1 code implementation • CVPR 2018 • Matthijs Douze, Arthur Szlam, Bharath Hariharan, Hervé Jégou

This paper considers the problem of inferring image labels from images when only a few annotated examples are available at training time.

Ranked #6 on Few-Shot Image Classification on ImageNet-FS (1-shot, novel)

Few-Shot Image Classification graph construction

Learning Joint Multilingual Sentence Representations with Neural Machine Translation

1 code implementation • WS 2017 • Holger Schwenk, Matthijs Douze

In this paper, we use the framework of neural machine translation to learn joint sentence representations across six very different languages.

Joint Multilingual Sentence Representations Machine Translation +2

396

Billion-scale similarity search with GPUs

12 code implementations • 28 Feb 2017 • Jeff Johnson, Matthijs Douze, Hervé Jégou

Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures.

Image Similarity Search Quantization

28,143

FastText.zip: Compressing text classification models

43 code implementations • 12 Dec 2016 • Armand Joulin, Edouard Grave, Piotr Bojanowski, Matthijs Douze, Hervé Jégou, Tomas Mikolov

We consider the problem of producing compact architectures for text classification, such that the full model fits in a limited amount of memory.

General Classification Quantization +2

25,565

How should we evaluate supervised hashing?

1 code implementation • 21 Sep 2016 • Alexandre Sablayrolles, Matthijs Douze, Hervé Jégou, Nicolas Usunier

Hashing produces compact representations for documents, to perform tasks like classification or retrieval based on these short codes.

General Classification Retrieval +1

Polysemous codes

9 code implementations • 7 Sep 2016 • Matthijs Douze, Hervé Jégou, Florent Perronnin

This paper considers the problem of approximate nearest neighbor search in the compressed domain.

Quantization

28,143

Convolutional Patch Representations for Image Retrieval: an Unsupervised Approach

no code implementations • 1 Mar 2016 • Mattis Paulin, Julien Mairal, Matthijs Douze, Zaid Harchaoui, Florent Perronnin, Cordelia Schmid

Convolutional neural networks (CNNs) have recently received a lot of attention due to their ability to model local stationary structures in natural images in a multi-scale fashion, when learning all model parameters with supervision.

Image Classification Image Retrieval +1

Local Convolutional Features With Unsupervised Training for Image Retrieval

no code implementations • ICCV 2015 • Mattis Paulin, Matthijs Douze, Zaid Harchaoui, Julien Mairal, Florent Perronnin, Cordelia Schmid

Patch-level descriptors underlie several important computer vision tasks, such as stereo-matching or content-based image retrieval.

Content-Based Image Retrieval Retrieval +2

Beat-Event Detection in Action Movie Franchises

no code implementations • 15 Aug 2015 • Danila Potapov, Matthijs Douze, Jérôme Revaud, Zaid Harchaoui, Cordelia Schmid

While important advances were recently made towards temporally localizing and recognizing specific human actions or activities in videos, efficient detection and classification of long video chunks belonging to semantically defined categories such as "pursuit" or "romance" remains challenging. We introduce a new dataset, Action Movie Franchises, consisting of a collection of Hollywood action movie franchises.

Classification Event Detection +1

Circulant temporal encoding for video retrieval and temporal alignment

1 code implementation • 8 Jun 2015 • Matthijs Douze, Jérôme Revaud, Jakob Verbeek, Hervé Jégou, Cordelia Schmid

We address the problem of specific video event retrieval.

Retrieval Video Retrieval

132

Event Retrieval in Large Video Collections with Circulant Temporal Encoding

no code implementations • CVPR 2013 • Jérôme Revaud, Matthijs Douze, Cordelia Schmid, Hervé Jégou

Furthermore, we extend product quantization to complex vectors in order to compress our descriptors, and to compare them in the compressed domain.

Copy Detection Quantization +1
