no code implementations • 22 Apr 2024 • Thibault Formal, Stéphane Clinchant, Hervé Déjean, Carlos Lassance
The late interaction paradigm introduced with ColBERT stands out in the neural Information Retrieval space, offering a compelling effectiveness-efficiency trade-off across many benchmarks.
no code implementations • 20 Apr 2024 • Carlos Lassance, Hervé Déjean, Stéphane Clinchant, Nicola Tonellotto
Learned sparse models such as SPLADE have successfully shown how to incorporate the benefits of state-of-the-art neural information retrieval models into the classical inverted index data structure.
no code implementations • 15 Mar 2024 • Hervé Déjean, Stéphane Clinchant, Thibault Formal
We present a comparative study between cross-encoder and LLM rerankers in the context of re-ranking effective SPLADE retrievers.
no code implementations • 11 Mar 2024 • Carlos Lassance, Hervé Déjean, Thibault Formal, Stéphane Clinchant
A companion to the release of the latest version of the SPLADE library.
1 code implementation • 5 Jun 2023 • Hervé Déjean, Stéphane Clinchant, Carlos Lassance, Simon Lupart, Thibault Formal
We compare both dense and sparse approaches under various finetuning protocols and middle training on different collections (MS MARCO, Wikipedia or Tripclick).
no code implementations • 25 Apr 2023 • Carlos Lassance, Stéphane Clinchant
This is why this paper aims to report the importance of this issue so that researchers can be made aware of this problem and appropriately report their results.
no code implementations • 25 Apr 2023 • Carlos Lassance, Simon Lupart, Hervé Déjean, Stéphane Clinchant, Nicola Tonellotto
Sparse neural retrievers, such as DeepImpact, uniCOIL and SPLADE, have been introduced recently as an efficient and effective way to perform retrieval with inverted indexes.
2 code implementations • 4 Apr 2023 • Jheng-Hong Yang, Carlos Lassance, Rafael Sampaio de Rezende, Krishna Srinivasan, Miriam Redi, Stéphane Clinchant, Jimmy Lin
This paper presents the AToMiC (Authoring Tools for Multimedia Content) dataset, designed to advance research in image/text cross-modal retrieval.
1 code implementation • 23 Mar 2023 • Vaishali Pal, Carlos Lassance, Hervé Déjean, Stéphane Clinchant
While previous studies have only experimented with dense retrievers or in a cross-lingual retrieval scenario, in this paper we aim to complete the picture on the use of adapters in IR.
no code implementations • 10 Mar 2023 • Carlos Lassance, Stéphane Clinchant
This paper describes our participation in the 2022 TREC NeuCLIR challenge.
no code implementations • 24 Feb 2023 • Carlos Lassance, Stéphane Clinchant
This paper describes our participation in the 2022 TREC Deep Learning challenge.
1 code implementation • 20 Feb 2023 • Guglielmo Faggioli, Thibault Formal, Stefano Marchesin, Stéphane Clinchant, Nicola Ferro, Benjamin Piwowarski
On top of that, in lexical-oriented scenarios, QPPs fail to predict performance for neural IR systems on those queries where they differ from traditional approaches the most.
no code implementations • 25 Jan 2023 • Carlos Lassance, Hervé Déjean, Stéphane Clinchant
In this paper, we study the impact of the pretraining collection on the final IR effectiveness.
no code implementations • 25 Jan 2023 • Simon Lupart, Stéphane Clinchant
Neural retrieval models have acquired significant effectiveness gains over the last few years compared to term-based methods.
1 code implementation • 8 Jul 2022 • Carlos Lassance, Stéphane Clinchant
SPLADE efficiency can be controlled via a regularization factor, but solely controlling this regularization has been shown to not be efficient enough.
1 code implementation • 10 May 2022 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of examples for training -- while still relying on the same backbone architecture.
no code implementations • 9 May 2022 • Hervé Déjean, Stéphane Clinchant, Jean-Luc Meunier
This paper investigates the Relation Extraction task in documents by benchmarking two different neural network models: a multi-modal language model (LayoutXLM) and a Graph Neural Network: Edge Convolution Network (ECN).
1 code implementation • 5 May 2022 • Simon Lupart, Thibault Formal, Stéphane Clinchant
To this end, we build three query-based distribution shifts within MS MARCO (query-semantic, query-intent, query-length), which are used to evaluate the three main families of neural retrievers based on BERT: sparse, dense, and late-interaction -- as well as a monoBERT re-ranker.
no code implementations • 20 Dec 2021 • Sarah Ibrahimi, Arnaud Sors, Rafael Sampaio de Rezende, Stéphane Clinchant
Learning with noisy labels is an active research area for image classification.
no code implementations • 13 Dec 2021 • Carlos Lassance, Maroua Maachou, Joohee Park, Stéphane Clinchant
Our experiments show that ColBERT indexes can be pruned up to 30% on the MS MARCO passage collection without a significant drop in performance.
no code implementations • 10 Dec 2021 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
Neural Information Retrieval models hold the promise to replace lexical matching models, e.g., BM25, in modern search engines.
1 code implementation • 21 Sep 2021 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant
Meanwhile, there has been a growing interest in learning sparse representations for documents and queries, that could inherit from the desirable properties of bag-of-words models such as the exact matching of terms and the efficiency of inverted indexes.
Ranked #5 on Zero-shot Text Search on BEIR
no code implementations • EMNLP 2021 • Alexandre Berard, Dain Lee, Stéphane Clinchant, Kweonwoo Jung, Vassilina Nikoulina
Multilingual NMT has become an attractive solution for MT deployment in production.
no code implementations • 1 Sep 2021 • Badr Youbi Idrissi, Stéphane Clinchant
Attacking Neural Machine Translation models is an inherently combinatorial task on discrete sequences, solved with approximate heuristics.
1 code implementation • 12 Jul 2021 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
In neural Information Retrieval, ongoing research is directed towards improving the first retriever in ranking pipelines.
no code implementations • 17 Dec 2020 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant
Transformer-based models are nowadays state-of-the-art in ad-hoc Information Retrieval, but their behavior is far from being understood.
no code implementations • WS 2019 • Stéphane Clinchant, Kweon Woo Jung, Vassilina Nikoulina
Exploiting large pretrained models for various NMT tasks has gained a lot of visibility recently.
no code implementations • 14 Jun 2019 • Stéphane Clinchant, Hervé Déjean, Jean-Luc Meunier, Eva Lang, Florian Kleber
We present in this paper experiments on Table Recognition in hand-written registry books.