1 code implementation • 18 Jun 2020 • Oscar de Lima, Hansal Shah, Ting-Sheng Chu, Brian Fogelson
Our algorithm also outperforms the IDQN baseline in the scenario where the number of passengers and cars varies from episode to episode.
Multi-agent Reinforcement Learning • reinforcement-learning