14 Feb 2022 • Mirco Mutti, Riccardo De Santi, Emanuele Rossi, Juan Felipe Calderon, Michael Bronstein, Marcello Restelli
In this setting, the agent can take a finite number of reward-free interactions from a subset of these environments.