Search Results for author: Siqiao Mu

Found 1 paper, 0 papers with code

On the Second-Order Convergence of Biased Policy Gradient Algorithms

no code implementations • 5 Nov 2023 • Siqiao Mu, Diego Klabjan

Since the objective functions of reinforcement learning problems are typically highly nonconvex, it is desirable that policy gradient, the most popular algorithm, escapes saddle points and arrives at second-order stationary points.

Policy Gradient Methods

