Search Results for author: Jean-Michel Renders

Found 17 papers, 6 papers with code

Unbiased Learning to Rank Meets Reality: Lessons from Baidu's Large-Scale Search Dataset

1 code implementation • 3 Apr 2024 • Philipp Hager, Romain Deffayet, Jean-Michel Renders, Onno Zoeter, Maarten de Rijke

However, these gains in click prediction do not translate to enhanced ranking performance on expert relevance annotations, implying that conclusions strongly depend on how success is measured in this benchmark.

Learning-To-Rank

SLIM: Skill Learning with Multiple Critics

no code implementations • 1 Feb 2024 • David Emukpere, Bingbing Wu, Julien Perez, Jean-Michel Renders

As it requires impacting a possibly large set of degrees of freedom composing the environment, mutual information maximization fails alone in producing useful and safe manipulation behaviors.

Hierarchical Reinforcement Learning

SARDINE: A Simulator for Automated Recommendation in Dynamic and Interactive Environments

1 code implementation • 28 Nov 2023 • Romain Deffayet, Thibaut Thonet, Dongyoon Hwang, Vassilissa Lehoux, Jean-Michel Renders, Maarten de Rijke

Simulators can provide valuable insights for researchers and practitioners who wish to improve recommender systems, because they allow one to easily tweak the experimental setup in which recommender systems operate, and as a result lower the cost of identifying general trends and uncovering novel findings about the candidate methods.

Counterfactual, Learning-To-Rank +1

Distributional Reinforcement Learning with Dual Expectile-Quantile Regression

no code implementations • 26 May 2023 • Sami Jullien, Romain Deffayet, Jean-Michel Renders, Paul Groth, Maarten de Rijke

Motivated by the efficiency of $L_2$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning while keeping an estimate of the full distribution of returns.

Continuous Control, Distributional Reinforcement Learning +3
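For readers unfamiliar with the two statistics named in the abstract above, the following is a minimal sketch of the standard quantile (pinball) and expectile regression losses. It is not the paper's dual training scheme; the function names and the NumPy formulation are assumptions made for illustration only.

```python
import numpy as np

def quantile_loss(pred, target, tau):
    """Pinball loss: an asymmetric L1 loss whose minimizer is the tau-quantile."""
    diff = np.asarray(target, dtype=float) - np.asarray(pred, dtype=float)
    # Under-predictions are weighted by tau, over-predictions by (1 - tau).
    return float(np.mean(np.where(diff >= 0, tau * diff, (tau - 1) * diff)))

def expectile_loss(pred, target, tau):
    """Asymmetric L2 loss whose minimizer is the tau-expectile."""
    diff = np.asarray(target, dtype=float) - np.asarray(pred, dtype=float)
    weight = np.where(diff >= 0, tau, 1 - tau)
    return float(np.mean(weight * diff ** 2))
```

With `tau = 0.5`, the expectile loss reduces to half the mean squared error, so its minimizer is the mean, while the quantile loss reduces to half the mean absolute error, so its minimizer is the median.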

An Offline Metric for the Debiasedness of Click Models

2 code implementations • 19 Apr 2023 • Romain Deffayet, Philipp Hager, Jean-Michel Renders, Maarten de Rijke

We prove that debiasedness is a necessary condition for recovering unbiased and consistent relevance scores and for the invariance of click prediction under covariate shift.

Counterfactual, Learning-To-Rank +1

Generative Slate Recommendation with Reinforcement Learning

no code implementations • 20 Jan 2023 • Romain Deffayet, Thibaut Thonet, Jean-Michel Renders, Maarten de Rijke

Our findings suggest that representation learning using generative models is a promising direction towards generalizable RL-based slate recommendation.

Recommendation Systems, reinforcement-learning +2

Offline Evaluation for Reinforcement Learning-based Recommendation: A Critical Issue and Some Alternatives

no code implementations • 3 Jan 2023 • Romain Deffayet, Thibaut Thonet, Jean-Michel Renders, Maarten de Rijke

In this paper, we argue that the paradigm commonly adopted for offline evaluation of sequential recommender systems is unsuitable for evaluating reinforcement learning-based recommenders.

Offline RL, Recommendation Systems +2

Pareto-Optimal Fairness-Utility Amortizations in Rankings with a DBN Exposure Model

no code implementations • 16 May 2022 • Till Kletti, Jean-Michel Renders, Patrick Loiseau

We lay out the structure of a new geometrical object (the DBN-expohedron), and propose for it a Carathéodory decomposition algorithm of complexity $O(n^3)$, where $n$ is the number of documents to rank.

Fairness, Open-Ended Question Answering

Introducing the Expohedron for Efficient Pareto-optimal Fairness-Utility Amortizations in Repeated Rankings

1 code implementation • 7 Feb 2022 • Till Kletti, Jean-Michel Renders, Patrick Loiseau

Such a decomposition makes it possible to express any feasible target exposure as a distribution over at most $n$ rankings.

Fairness

SmoothI: Smooth Rank Indicators for Differentiable IR Metrics

1 code implementation • 3 May 2021 • Thibaut Thonet, Yagmur Gizem Cinar, Eric Gaussier, Minghan Li, Jean-Michel Renders

To address this shortcoming, we propose SmoothI, a smooth approximation of rank indicators that serves as a basic building block to devise differentiable approximations of IR metrics.

Information Retrieval, Learning-To-Rank +1

Real-Time Optimization Of Web Publisher RTB Revenues

no code implementations • 12 Jun 2020 • Pedro Chahuara, Nicolas Grislain, Grégoire Jauvion, Jean-Michel Renders

The real-world challenges we had to tackle consist mainly of tracking the dependencies on both the user and placement in a highly non-stationary environment and of dealing with censored bid observations.

Modeling ASR Ambiguity for Dialogue State Tracking Using Word Confusion Networks

1 code implementation • 3 Feb 2020 • Vaishali Pal, Fabien Guillot, Manish Shrivastava, Jean-Michel Renders, Laurent Besacier

Spoken dialogue systems typically use a list of top-N ASR hypotheses for inferring the semantic meaning and tracking the state of the dialogue.

Dialogue State Tracking, Spoken Dialogue Systems

Active Search for High Recall: a Non-Stationary Extension of Thompson Sampling

no code implementations • 27 Dec 2017 • Jean-Michel Renders

We consider the problem of Active Search, where as many relevant objects as possible (ideally all of them) should be retrieved with minimum effort or minimum time.

Multi-Armed Bandits, Thompson Sampling
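As background for the entry above, here is a minimal sketch of standard (stationary) Beta-Bernoulli Thompson Sampling. It does not implement the paper's non-stationary extension; the function name, the uniform Beta(1, 1) prior, and the simulated reward probabilities are assumptions chosen for illustration.

```python
import numpy as np

def thompson_sampling(true_probs, n_rounds, seed=0):
    """Run Beta-Bernoulli Thompson Sampling and return the pull count per arm."""
    rng = np.random.default_rng(seed)
    k = len(true_probs)
    successes = np.ones(k)  # Beta(1, 1) prior: alpha parameter per arm
    failures = np.ones(k)   # Beta(1, 1) prior: beta parameter per arm
    pulls = np.zeros(k, dtype=int)
    for _ in range(n_rounds):
        # Draw one plausible success rate per arm from its posterior.
        samples = rng.beta(successes, failures)
        arm = int(np.argmax(samples))
        # Simulated Bernoulli feedback (hypothetical environment).
        reward = float(rng.random() < true_probs[arm])
        successes[arm] += reward
        failures[arm] += 1.0 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.1, 0.9], n_rounds=500)
```

Because each arm is chosen with the posterior probability that it is the best one, pulls concentrate on the arm with the higher simulated success rate as evidence accumulates.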

Efficient Online Learning for Optimizing Value of Information: Theory and Application to Interactive Troubleshooting

no code implementations • 16 Mar 2017 • Yuxin Chen, Jean-Michel Renders, Morteza Haghir Chehreghani, Andreas Krause

We consider the optimal value of information (VoI) problem, where the goal is to sequentially select a set of tests with a minimal cost, so that one can efficiently make the best decision based on the observed outcomes.

Joint Event Detection and Entity Resolution: a Virtuous Cycle

no code implementations • 18 Jul 2016 • Matthias Galle, Jean-Michel Renders, Guillaume Jacquet

Clustering web documents has numerous applications, such as aggregating news articles into meaningful events, detecting trends and hot topics on the Web, preserving diversity in search results, etc.

Clustering, Entity Resolution +1

LSTM-based Mixture-of-Experts for Knowledge-Aware Dialogues

no code implementations • WS 2016 • Phong Le, Marc Dymetman, Jean-Michel Renders

We introduce an LSTM-based method for dynamically integrating several word-prediction experts to obtain a conditional language model which can be good simultaneously at several subtasks.

Language Modelling, Question Answering

Task-Driven Linguistic Analysis based on an Underspecified Features Representation

no code implementations • LREC 2012 • Stasinos Konstantopoulos, Valia Kordoni, Nicola Cancedda, Vangelis Karkaletsis, Dietrich Klakow, Jean-Michel Renders

In this paper we explore a task-driven approach to interfacing NLP components, where language processing is guided by the end-task that each application requires.
