no code implementations • 18 Sep 2019 • Olivier Jeunen, Dmytro Mykhaylov, David Rohde, Flavian vasile, Alexandre Gilotte, Martin Bompaire
In order to handle this "bandit-feedback" setting, several Counterfactual Risk Minimisation (CRM) methods have been proposed in recent years, that attempt to estimate the performance of different policies on historical data.
no code implementations • 24 Apr 2019 • Dmytro Mykhaylov, David Rohde, Flavian vasile, Martin Bompaire, Olivier Jeunen
There are three quite distinct ways to train a machine learning model on recommender system logs.