1 code implementation • 18 Nov 2021 • Minghan Li, Diana Nicoleta Popa, Johan Chagnon, Yagmur Gizem Cinar, Eric Gaussier
Transformer-based models, particularly pre-trained language models such as BERT, have proven highly effective on a wide range of natural language processing and information retrieval tasks.
1 code implementation • 3 May 2021 • Thibaut Thonet, Yagmur Gizem Cinar, Eric Gaussier, Minghan Li, Jean-Michel Renders
To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics.
1 code implementation • CoNLL 2020 • Romain Couillet, Yagmur Gizem Cinar, Eric Gaussier, Muhammad Imran
This article establishes that, unlike the legacy tf*idf representation, recent natural language representations (word embedding vectors) tend to exhibit a so-called "concentration of measure" phenomenon, in the sense that, as the representation size $p$ and database size $n$ are both large, their behavior is similar to that of large dimensional Gaussian random vectors.
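The Gaussian-like behavior described above can be illustrated numerically (a sketch with synthetic data, not the article's experiments): for normalized $p$-dimensional Gaussian vectors, the norms concentrate tightly around 1 and pairwise inner products concentrate around 0 as $p$ grows. The dimensions chosen below are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 1000, 500  # representation size and sample count (illustrative)

# n vectors drawn from N(0, I_p / p), so E[||x||^2] = 1
X = rng.standard_normal((n, p)) / np.sqrt(p)

# Norms concentrate around 1 with fluctuations of order 1/sqrt(p)
norms = np.linalg.norm(X, axis=1)

# Off-diagonal inner products concentrate around 0
G = X @ X.T
off_diag = G[~np.eye(n, dtype=bool)]
```

With tf*idf vectors, by contrast, such statistics are far more spread out, which is the contrast the article exploits.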