no code implementations • 3 May 2024 • Luca Viano, Stratis Skoulakis, Volkan Cevher
We present ILARL, a new algorithm for imitation learning in infinite-horizon linear MDPs, which greatly improves the bound on the number of trajectories that the learner needs to sample from the environment.
no code implementations • 25 Apr 2023 • Fanghui Liu, Luca Viano, Volkan Cevher
In online reinforcement learning (RL), instead of employing standard structural assumptions on Markov decision processes (MDPs), using a certain coverage condition (originally from offline RL) is enough to ensure sample-efficient guarantees (Xie et al. 2023).
1 code implementation • 22 Sep 2022 • Paul Rolland, Luca Viano, Norman Schuerhoff, Boris Nikolov, Volkan Cevher
While Reinforcement Learning (RL) aims to train an agent from a reward function in a given environment, Inverse Reinforcement Learning (IRL) seeks to recover the reward function from observing an expert's behavior.
2 code implementations • 22 Sep 2022 • Luca Viano, Angeliki Kamoutsi, Gergely Neu, Igor Krawczuk, Volkan Cevher
Thanks to the proximal point method (PPM), we avoid the nested policy evaluation and cost updates for online IL that appear in the prior literature.
no code implementations • 15 Sep 2022 • Fanghui Liu, Luca Viano, Volkan Cevher
To be specific, we focus on a value-based algorithm with $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed with Besov (and Barron) function spaces, respectively, which aims to approximate an $\alpha$-smooth Q-function in a $d$-dimensional feature space.
1 code implementation • 12 Feb 2022 • Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Craig Innes, Subramanian Ramamoorthy, Adrian Weller
Imitation learning (IL) is a popular paradigm for training policies in robotic systems when specifying the reward function is difficult.
no code implementations • 12 Feb 2022 • Luca Viano, Johanni Brea
Abstract object properties and their relations are deeply rooted in human common sense, allowing people to predict the dynamics of the world even in situations that are novel but governed by familiar laws of physics.
no code implementations • 29 Sep 2021 • Ahmet Alacaoglu, Luca Viano, Niao He, Volkan Cevher
Our sample complexities also match the best-known results for global convergence of policy gradient and two-time-scale actor-critic algorithms in the single-agent setting.
1 code implementation • NeurIPS 2021 • Luca Viano, Yu-Ting Huang, Parameswaran Kamalaruban, Adrian Weller, Volkan Cevher
We study the inverse reinforcement learning (IRL) problem under a transition dynamics mismatch between the expert and the learner.