Solving the scalarization issues of Advantage-based Reinforcement Learning Algorithms

This research investigates some of the issues that arise from the scalarization of the multi-objective optimization problem in the Advantage Actor Critic (A2C) reinforcement learning algorithm. The paper shows how a naive scalarization can lead to gradients overlapping...
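
To make the idea of a naive scalarization concrete, the following is a minimal sketch, not the paper's method. It assumes the common A2C formulation in which a policy loss, a value loss, and an entropy bonus are combined into one weighted sum. The function names, the coefficients `value_coef` and `entropy_coef`, and the toy gradient values are all hypothetical and chosen only for illustration. The sketch shows one way the per-objective gradients can interfere when they act on shared parameters: here the policy and value gradients cancel exactly.

```python
# Hypothetical illustration of a weighted-sum (scalarized) A2C-style loss.
# All names, coefficients, and numbers are assumptions, not taken from the paper.


def scalarized_loss(policy_loss, value_loss, entropy,
                    value_coef=0.5, entropy_coef=0.01):
    """Combine the three A2C objectives into a single scalar by weighted sum."""
    return policy_loss + value_coef * value_loss - entropy_coef * entropy


def scalarized_gradient(grad_policy, grad_value, grad_entropy,
                        value_coef=0.5, entropy_coef=0.01):
    """Gradient of the scalarized loss with respect to shared parameters.

    By linearity of differentiation this is the same weighted sum applied
    to the per-objective gradients, so every objective is added into the
    same parameter update.
    """
    return [
        gp + value_coef * gv - entropy_coef * ge
        for gp, gv, ge in zip(grad_policy, grad_value, grad_entropy)
    ]


# Toy per-objective gradients on two shared parameters (made-up numbers).
grad_policy = [1.0, -2.0]
grad_value = [-2.0, 4.0]
grad_entropy = [0.0, 0.0]

# With value_coef = 0.5 the value gradient exactly offsets the policy
# gradient, so the combined update on the shared parameters is zero.
combined = scalarized_gradient(grad_policy, grad_value, grad_entropy)
print(combined)
```

In this toy case the result depends entirely on the chosen weights: a different `value_coef` would let one objective dominate the shared update instead of cancelling it.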

