no code implementations • 4 Jan 2021 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet
On the other hand, this heuristic performs reasonably well in practice and it even has sublinear, and even near-optimal, regret bounds in some very specific linear contextual and Bayesian bandit models.
no code implementations • 28 Dec 2020 • Matthieu Jedor, Jonathan Louëdec, Vianney Perchet
Continuously learning and leveraging the knowledge accumulated from prior tasks in order to improve future performance is a long standing machine learning problem.