no code implementations • 3 Dec 2021 • Hepeng Li, Nicholas Clavette, Haibo He
We present an analytical policy update rule that is independent of parametric function approximators.
no code implementations • 15 Oct 2020 • Hepeng Li, Haibo He
By making a series of approximations to the consensus optimization model, we propose a decentralized multi-agent reinforcement learning (MARL) algorithm, which we call multi-agent TRPO (MATRPO).