1 code implementation • 25 Jan 2023 • Tim Pearce, Tabish Rashid, Anssi Kanervisto, Dave Bignell, Mingfei Sun, Raluca Georgescu, Sergio Valcarcel Macua, Shan Zheng Tan, Ida Momennejad, Katja Hofmann, Sam Devlin
This paper studies the application of diffusion models as observation-to-action models for imitating human behaviour in sequential environments.
no code implementations • 23 Oct 2021 • Sergio Valcarcel Macua, Ian Davies, Aleksi Tukiainen, Enrique Munoz de Cote
We propose a fully distributed actor-critic architecture, named Diff-DAC, with application to multitask reinforcement learning (MRL).
no code implementations • 9 Oct 2019 • Marcin B. Tomczak, Sergio Valcarcel Macua, Enrique Munoz de Cote, Peter Vrancx
In this work, we establish conditions under which the parametric approximation of the critic does not introduce bias into the updates of the surrogate objective.
no code implementations • ICLR 2018 • Sergio Valcarcel Macua, Javier Zazo, Santiago Zazo
This is a considerable improvement over the previously standard approach for the closed-loop (CL) analysis of Markov potential games (MPGs), which gives no approximate solution if no Nash equilibrium (NE) belongs to the chosen parametric family, and which is practical only for simple parametric forms.
no code implementations • 28 Oct 2017 • Sergio Valcarcel Macua, Aleksi Tukiainen, Daniel García-Ocaña Hernández, David Baldazo, Enrique Munoz de Cote, Santiago Zazo
We propose a fully distributed actor-critic algorithm approximated by deep neural networks, named Diff-DAC, with application to single-task and to average multitask reinforcement learning (MRL).
no code implementations • 30 Dec 2013 • Sergio Valcarcel Macua, Jianshu Chen, Santiago Zazo, Ali H. Sayed
We apply diffusion strategies to develop a fully distributed cooperative reinforcement learning algorithm in which agents in a network communicate only with their immediate neighbors to improve predictions about their environment.
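The neighbor-only communication pattern described in the last abstract can be illustrated with a toy sketch. This is not the paper's reinforcement learning algorithm: it assumes the common adapt-then-combine form of a diffusion strategy and applies it to a much simpler problem, in which every agent estimates one shared scalar from its own noisy observations. The function name `diffusion_mean_estimate`, the ring topology, the uniform combination weights, and all parameter values are hypothetical choices made for illustration.

```python
import random


def diffusion_mean_estimate(neighbors, true_value, step=0.1, noise=0.1,
                            iters=500, seed=0):
    """Toy adapt-then-combine diffusion (hypothetical helper, not from the paper).

    neighbors[k] lists the agents that agent k can hear, including k itself.
    Every agent observes true_value corrupted by Gaussian noise.
    """
    rng = random.Random(seed)
    n = len(neighbors)
    w = [0.0] * n  # each agent's current estimate
    for _ in range(iters):
        # Adapt: each agent takes a local step using only its own observation.
        psi = []
        for k in range(n):
            obs = true_value + rng.gauss(0.0, noise)
            psi.append(w[k] + step * (obs - w[k]))
        # Combine: each agent averages the intermediate estimates of its
        # immediate neighbors (uniform weights are an assumption here).
        w = [sum(psi[l] for l in neighbors[k]) / len(neighbors[k])
             for k in range(n)]
    return w


# Ring of 6 agents: each agent hears itself and its two adjacent agents.
n_agents = 6
ring = [[(k - 1) % n_agents, k, (k + 1) % n_agents] for k in range(n_agents)]
estimates = diffusion_mean_estimate(ring, true_value=3.0)
print(estimates)
```

No agent ever reads an estimate from outside its own neighborhood, yet information still spreads across the ring because the combine step is repeated at every iteration.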