no code implementations • 16 Nov 2023 • Astrid Vanneste, Simon Vanneste, Olivier Vasseur, Robin Janssens, Mattias Billast, Ali Anwar, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
We demonstrate our approach on two scenarios and compare the resulting path with path planning using a Frenet frame and path planning based on a proximal policy optimization (PPO) agent.
no code implementations • 9 Aug 2023 • Astrid Vanneste, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
We perform this comparison in the context of communication learning using gradients from other agents, and run tests on several environments.
no code implementations • 9 Aug 2023 • Astrid Vanneste, Thomas Somers, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Peter Hellinckx
Therefore, we analyse the communication protocol of the agents that use the mean message encoder and conclude that they combine an exponential and a logarithmic function in their communication policy to avoid losing important information after the mean message encoder is applied.
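To make the exponential and logarithmic combination mentioned above concrete, here is a minimal sketch of how such a pairing can keep a single important message from being averaged away. It is an illustrative assumption, not the protocol reported in the paper: the function names, the `sharpness` parameter, and the scalar messages are all hypothetical.

```python
import math


def mean_encoder(messages):
    """Average a list of scalar messages into one value (the shared encoder)."""
    return sum(messages) / len(messages)


def exp_mean_log(values, sharpness=10.0):
    """Hypothetical protocol: senders exponentiate, the encoder averages,
    and the receiver applies a logarithm to undo the exponential."""
    encoded = [math.exp(sharpness * v) for v in values]  # sender side
    aggregated = mean_encoder(encoded)                   # mean message encoder
    return math.log(aggregated) / sharpness              # receiver side


# Three agents have nothing to report, one agent sends an important value.
values = [0.0, 0.0, 0.0, 1.0]

plain = mean_encoder(values)      # the important value is diluted by the mean
recovered = exp_mean_log(values)  # stays much closer to the largest value

print(f"plain mean: {plain:.3f}, exp/mean/log: {recovered:.3f}")
```

With a plain mean the single non-zero message is divided by the number of agents, whereas the exponential lets the largest message dominate the average before the logarithm maps it back to the original scale.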
no code implementations • 12 Apr 2022 • Astrid Vanneste, Simon Vanneste, Kevin Mets, Tom De Schepper, Siegfried Mercelis, Steven Latré, Peter Hellinckx
The most common approach to allow learned communication between agents is the use of a differentiable communication channel that allows gradients to flow between agents as a form of feedback.
Multi-agent Reinforcement Learning • reinforcement-learning
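The differentiable communication channel described in the entry above can be sketched with two scalar agents and hand-written gradients. This is a toy under stated assumptions, not the method of any listed paper: the linear sender and receiver, the squared-error loss, and all names (`forward`, `gradients`, `train`) are hypothetical.

```python
def forward(w_sender, w_receiver, observation):
    """The sender emits a continuous message; the receiver acts on it."""
    message = w_sender * observation
    prediction = w_receiver * message
    return message, prediction


def gradients(w_sender, w_receiver, observation, target):
    """Backpropagate a squared error through the receiver and the channel."""
    message, prediction = forward(w_sender, w_receiver, observation)
    d_loss_d_pred = 2.0 * (prediction - target)
    grad_receiver = d_loss_d_pred * message
    # Feedback crossing the channel: the receiver's error with respect to
    # the message is handed back to the sender.
    d_loss_d_message = d_loss_d_pred * w_receiver
    grad_sender = d_loss_d_message * observation
    return grad_sender, grad_receiver


def train(steps=200, learning_rate=0.05):
    """Jointly update both agents with plain gradient descent."""
    w_sender, w_receiver = 0.5, 0.5
    observation, target = 1.0, 2.0
    for _ in range(steps):
        g_s, g_r = gradients(w_sender, w_receiver, observation, target)
        w_sender -= learning_rate * g_s
        w_receiver -= learning_rate * g_r
    _, prediction = forward(w_sender, w_receiver, observation)
    return w_sender, w_receiver, prediction


w_sender, w_receiver, prediction = train()
print(f"sender weight: {w_sender:.3f}, prediction: {prediction:.3f}")
```

The sender never sees the target directly; its only learning signal is `d_loss_d_message`, the gradient passed back through the message. A discrete or non-differentiable channel would block this signal, which is why such channels need a different feedback mechanism.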
no code implementations • 29 Oct 2021 • Simon Vanneste, Gauthier de Borrekens, Stig Bosmans, Astrid Vanneste, Kevin Mets, Siegfried Mercelis, Steven Latré, Peter Hellinckx
In this paper, we investigate independent Q-learning (IQL) without communication and differentiable inter-agent learning (DIAL) with learned communication on an adaptive traffic control system (ATCS).
no code implementations • 29 Oct 2021 • Astrid Vanneste, Wesley Van Wijnsberghe, Simon Vanneste, Kevin Mets, Siegfried Mercelis, Steven Latré, Peter Hellinckx
We look at the difference in performance between communication that is private for a team and communication that can be overheard by the other team.
Multi-agent Reinforcement Learning • reinforcement-learning
no code implementations • 12 Jun 2020 • Simon Vanneste, Astrid Vanneste, Kevin Mets, Tom De Schepper, Ali Anwar, Siegfried Mercelis, Steven Latré, Peter Hellinckx
The credit assignment problem, the non-stationarity of the communication environment and the creation of influenceable agents are major challenges within this research field that must be overcome to learn a valid communication protocol.