4 Dec 2020 • Wangshu Zhu, Andre Rosendo
To address this issue, we present a PPO variant named Proximal Policy Optimization Smooth Algorithm (PPOS). Its critical improvement is the use of a functional clipping method in place of a flat clipping method.
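To make the distinction concrete, the sketch below contrasts the flat clip of standard PPO, which holds the probability ratio constant once it leaves the interval [1 - eps, 1 + eps], with one possible functional clip that instead bends the ratio back toward 1 outside that interval. The tanh-based form, the slope parameter `alpha`, and the function names are assumptions chosen for illustration; this passage does not specify the function PPOS actually uses.

```python
import math


def flat_clip(ratio, eps=0.2):
    """Standard PPO clip: the ratio is held constant outside [1 - eps, 1 + eps]."""
    return max(1.0 - eps, min(ratio, 1.0 + eps))


def functional_clip(ratio, eps=0.2, alpha=0.3):
    """Hypothetical functional clip (assumed tanh form, not confirmed by the text).

    Inside [1 - eps, 1 + eps] the ratio passes through unchanged. Outside it,
    the output slopes back toward 1 instead of staying flat. The constant terms
    make the function continuous at both boundaries.
    """
    if ratio > 1.0 + eps:
        return -alpha * math.tanh(ratio - 1.0) + 1.0 + eps + alpha * math.tanh(eps)
    if ratio < 1.0 - eps:
        return -alpha * math.tanh(ratio - 1.0) + 1.0 - eps - alpha * math.tanh(eps)
    return ratio


def surrogate(ratio, advantage, clip_fn):
    """Per-sample clipped surrogate objective: min(r * A, clip(r) * A)."""
    return min(ratio * advantage, clip_fn(ratio) * advantage)


if __name__ == "__main__":
    # Compare the two clipping rules across a range of probability ratios.
    for r in (0.5, 0.8, 1.0, 1.2, 1.5, 2.0):
        print(f"r={r:.1f}  flat={flat_clip(r):.4f}  functional={functional_clip(r):.4f}")
```

Under these assumptions, the flat clip yields zero gradient with respect to the ratio once it is outside the clipping range, whereas the sloped variant keeps a nonzero gradient that points back toward the range.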