Policy Gradient Methods

Reinforcement Learning • 24 methods

Policy gradient methods optimize the policy directly: they parameterize the policy and adjust its parameters by gradient ascent on expected return. This contrasts with value-based methods such as Q-learning, where the policy is implicit and actions are chosen by maximizing a learned value function. Below is a continuously updated catalog of policy gradient methods.
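
To make "optimizing the policy directly" concrete, here is a minimal sketch of a REINFORCE-style update on a toy multi-armed bandit. It is an illustration under stated assumptions, not the implementation of any cataloged method: the Gaussian bandit, the softmax parameterization, the running-average baseline, and the names `softmax` and `reinforce_bandit` are all hypothetical choices made for this example.

```python
import math
import random


def softmax(prefs):
    """Turn action preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_bandit(arm_means, steps=5000, lr=0.1, seed=0):
    """Run REINFORCE-style updates on a hypothetical Gaussian bandit.

    theta holds one preference per arm and the policy is softmax(theta).
    No value function is used to choose actions; the policy parameters
    themselves are moved along the sampled policy gradient.
    """
    rng = random.Random(seed)
    theta = [0.0] * len(arm_means)
    baseline = 0.0  # running average of past rewards (variance reduction)
    for t in range(1, steps + 1):
        probs = softmax(theta)
        # Sample an action from the current stochastic policy.
        action = rng.choices(range(len(theta)), weights=probs)[0]
        # Assumed environment: unit-variance Gaussian reward per arm.
        reward = rng.gauss(arm_means[action], 1.0)
        advantage = reward - baseline
        baseline += (reward - baseline) / t
        # For a softmax policy, d log pi(action) / d theta[k]
        # equals 1[k == action] - probs[k].
        for k in range(len(theta)):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * advantage * grad_log
    return softmax(theta)


if __name__ == "__main__":
    # Arm 1 pays more on average, so its probability should grow.
    print(reinforce_bandit([0.0, 1.0]))
```

Calling `reinforce_bandit([0.0, 1.0])` returns the learned action probabilities, which should concentrate on the better-paying second arm. A Q-learning agent on the same problem would instead estimate each arm's value and act greedily with respect to those estimates.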

Method Year Papers
2017 614
2015 189
1999 159
2018 90
2016 70
2015 69
2016 48
2018 43
2017 33
2018 15
2014 13
2018 10
2016 10
2018 6
2017 2
2017 2
2020 2
2017 1
2018 1
2020 1
2021 1
2021 1
2000 1