4 code implementations • 10 Nov 2021 • Zhaoxing Yang, Rong Ding, Haiming Jin, Yifei Wei, Haoyi You, Guiyun Fan, Xiaoying Gan, Xinbing Wang
In addition, with such modularization, the training algorithm of DeCOM separates the original constrained optimization into an unconstrained optimization on reward and a constraints satisfaction problem on costs.
Multi-agent Reinforcement Learning reinforcement-learning +1