QPLEX: Duplex Dueling Multi-Agent Q-Learning

We explore value-based multi-agent reinforcement learning (MARL) in the popular paradigm of centralized training with decentralized execution (CTDE). A key concept in CTDE is the Individual-Global-Max (IGM) principle, which requires consistency between joint and local action selections to support efficient local decision-making...
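
To make the IGM principle concrete, the following is a minimal sketch, not the paper's implementation. It checks on a toy problem whether the greedy joint action matches the tuple of each agent's greedy local action. The utilities in `q_locals` and the two joint value functions (`additive_q`, `mismatched_q`) are hypothetical values chosen only for illustration, and the exhaustive search over joint actions is feasible only for very small action spaces.

```python
from itertools import product


def greedy_local_actions(q_locals):
    """Each agent independently picks the action maximizing its own utility."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in q_locals)


def greedy_joint_action(q_joint, n_actions):
    """Exhaustive argmax over the joint action space (toy sizes only)."""
    joint_actions = product(*(range(n) for n in n_actions))
    return max(joint_actions, key=q_joint)


def satisfies_igm(q_joint, q_locals):
    """True if the joint greedy action equals the tuple of local greedy actions."""
    n_actions = [len(q) for q in q_locals]
    return greedy_joint_action(q_joint, n_actions) == greedy_local_actions(q_locals)


# Hypothetical per-agent utilities: two agents, three actions each.
q_locals = [[0.1, 0.7, 0.3], [0.5, 0.2, 0.9]]


def additive_q(joint_action):
    """Joint value as a plain sum of local utilities (consistent with IGM)."""
    return sum(q[a] for q, a in zip(q_locals, joint_action))


def mismatched_q(joint_action):
    """Joint value whose maximum disagrees with the local greedy choices."""
    return 1.0 if joint_action == (0, 0) else 0.0


print(satisfies_igm(additive_q, q_locals))
print(satisfies_igm(mismatched_q, q_locals))
```

In this sketch the additive joint value keeps the joint and local greedy choices aligned, while the mismatched one does not, which is the kind of inconsistency IGM rules out.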

ICLR 2021

Methods used in the Paper

Dense Connections
Feedforward Networks
Double Q-learning
Off-Policy TD Control
Dueling Network
Q-Learning Networks