no code implementations • 4 May 2021 • Sajad Khodadadian, Prakirt Raj Jhunjhunwala, Sushil Mahavir Varma, Siva Theja Maguluri
We further improve this convergence result by introducing a variant of Natural Policy Gradient with adaptive step sizes.
Policy Gradient Methods • reinforcement-learning