no code implementations • 4 Dec 2023 • Jimmy Lin, Tommaso Teofili
In this work, we explore the contrarian approach of performing top-$k$ retrieval on dense vector representations using inverted indexes.
no code implementations • 29 Aug 2023 • Jimmy Lin, Ronak Pradeep, Tommaso Teofili, Jasper Xian
We provide a reproducible, end-to-end demonstration of vector search with OpenAI embeddings using Lucene on the popular MS MARCO passage ranking test collection.
no code implementations • 24 Apr 2023 • Xueguang Ma, Tommaso Teofili, Jimmy Lin
With Pyserini, which provides a Python interface to Anserini, users gain access to both sparse and dense retrieval models, as Pyserini implements bindings to the Faiss vector search library alongside Lucene inverted indexes in a uniform, consistent interface.
1 code implementation • 24 Mar 2022 • Tommaso Teofili, Donatella Firmani, Nick Koudas, Vincenzo Martello, Paolo Merialdo, Divesh Srivastava
CERTA builds on a probabilistic framework that computes explanations by evaluating the outcomes produced when perturbed copies of the input records are used.
1 code implementation • 26 Apr 2021 • Rob Geada, Tommaso Teofili, Rui Vieira, Rebecca Whitworth, Daniele Zonca
TrustyAI is an initiative that investigates explainable artificial intelligence (XAI) solutions to address the issue of explainability in the context of both AI models and decision services.
Explainable Artificial Intelligence (XAI)
no code implementations • 22 Oct 2019 • Tommaso Teofili, Jimmy Lin
We demonstrate three approaches for adapting the open-source Lucene search library to perform approximate nearest-neighbor search on arbitrary dense vectors, using similarity search on word embeddings as a case study.
no code implementations • 4 Sep 2019 • Tommaso Teofili, Niyati Chhaya
Distributed representations of words have been shown to improve the effectiveness of IR systems in many sub-tasks, such as query expansion, retrieval, and ranking.