no code implementations • 20 Oct 2022 • Antonio Terpin, Nicolas Lanzetti, Batuhan Yardim, Florian Dörfler, Giorgia Ramponi
In this paper, we explore optimal transport discrepancies (which include the Wasserstein distance) to define trust regions, and we propose a novel algorithm - Optimal Transport Trust Region Policy Optimization (OT-TRPO) - for continuous state-action spaces.