Search Results for author: Thibault Formal

Found 13 papers, 6 papers with code

SPLATE: Sparse Late Interaction Retrieval

no code implementations • 22 Apr 2024 • Thibault Formal, Stéphane Clinchant, Hervé Déjean, Carlos Lassance

The late interaction paradigm introduced with ColBERT stands out in the neural Information Retrieval space, offering a compelling effectiveness-efficiency trade-off across many benchmarks.

Information Retrieval Re-Ranking +1

Paper
Add Code

A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE

no code implementations • 15 Mar 2024 • Hervé Déjean, Stéphane Clinchant, Thibault Formal

We present a comparative study between cross-encoder and LLMs rerankers in the context of re-ranking effective SPLADE retrievers.

Re-Ranking

Paper
Add Code

SPLADE-v3: New baselines for SPLADE

no code implementations • 11 Mar 2024 • Carlos Lassance, Hervé Déjean, Thibault Formal, Stéphane Clinchant

A companion to the release of the latest version of the SPLADE library.

Paper
Add Code

Benchmarking Middle-Trained Language Models for Neural Search

1 code implementation • 5 Jun 2023 • Hervé Déjean, Stéphane Clinchant, Carlos Lassance, Simon Lupart, Thibault Formal

We compare both dense and sparse approaches under various finetuning protocols and middle training on different collections (MS MARCO, Wikipedia or Tripclick).

Benchmarking Language Modelling +1

641

Paper
Code

Query Performance Prediction for Neural IR: Are We There Yet?

1 code implementation • 20 Feb 2023 • Guglielmo Faggioli, Thibault Formal, Stefano Marchesin, Stéphane Clinchant, Nicola Ferro, Benjamin Piwowarski

On top of that, in lexical-oriented scenarios, QPPs fail to predict performance for neural IR systems on those queries where they differ from traditional approaches the most.

Passage Retrieval Retrieval

Paper
Code

CoSPLADE: Contextualizing SPLADE for Conversational Information Retrieval

no code implementations • 11 Jan 2023 • Nam Le Hai, Thomas Gerald, Thibault Formal, Jian-Yun Nie, Benjamin Piwowarski, Laure Soulier

Conversational search is a difficult task as it aims at retrieving documents based not only on the current user query but also on the full conversation history.

Conversational Search Information Retrieval +2

Paper
Add Code

From Distillation to Hard Negative Sampling: Making Sparse Neural IR Models More Effective

1 code implementation • 10 May 2022 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant

Neural retrievers based on dense representations combined with Approximate Nearest Neighbors search have recently received a lot of attention, owing their success to distillation and/or better sampling of examples for training -- while still relying on the same backbone architecture.

Language Modelling Representation Learning

641

Paper
Code

MS-Shift: An Analysis of MS MARCO Distribution Shifts on Neural Retrieval

1 code implementation • 5 May 2022 • Simon Lupart, Thibault Formal, Stéphane Clinchant

To this end, we build three query-based distribution shifts within MS MARCO (query-semantic, query-intent, query-length), which are used to evaluate the three main families of neural retrievers based on BERT: sparse, dense, and late-interaction -- as well as a monoBERT re-ranker.

Information Retrieval Retrieval

Paper
Code

Composite Code Sparse Autoencoders for first stage retrieval

no code implementations • 14 Apr 2022 • Carlos Lassance, Thibault Formal, Stephane Clinchant

Second, CCSA can be used as a binary quantization method and we propose to combine it with the recent graph based ANN techniques.

Image Retrieval Information Retrieval +2

Paper
Add Code

Match Your Words! A Study of Lexical Matching in Neural Information Retrieval

no code implementations • 10 Dec 2021 • Thibault Formal, Benjamin Piwowarski, Stéphane Clinchant

Neural Information Retrieval models hold the promise to replace lexical matching models, e. g. BM25, in modern search engines.

Information Retrieval Retrieval

Paper
Add Code

SPLADE v2: Sparse Lexical and Expansion Model for Information Retrieval

1 code implementation • 21 Sep 2021 • Thibault Formal, Carlos Lassance, Benjamin Piwowarski, Stéphane Clinchant

Meanwhile, there has been a growing interest in learning \emph{sparse} representations for documents and queries, that could inherit from the desirable properties of bag-of-words models such as the exact matching of terms and the efficiency of inverted indexes.

Ranked #5 on Zero-shot Text Search on BEIR