Policy Gradient Methods

90 papers with code • 0 benchmarks • 2 datasets

Latest papers with no code

Actor-Critic Reinforcement Learning with Phased Actor

no code yet • 18 Apr 2024

We prove qualitative properties of PAAC for learning convergence of the value and policy, solution optimality, and stability of system dynamics.

Intervention-Assisted Policy Gradient Methods for Online Stochastic Queuing Network Optimization: Technical Report

no code yet • 5 Apr 2024

This framework combines the learning power of neural networks with the guaranteed stability of classical control policies for SQNs.

Elementary Analysis of Policy Gradient Methods

no code yet • 4 Apr 2024

Projected policy gradient under the simplex parameterization, and policy gradient and natural policy gradient under the softmax parameterization, are fundamental algorithms in reinforcement learning.
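
For orientation, the three update rules named in this summary can be sketched on a toy problem. The following is a minimal illustration only, assuming a single-state multi-armed bandit with known rewards and exact gradients; the function names (`softmax_pg_step`, `softmax_npg_step`, `projected_pg_step`, `project_simplex`), the reward vector, and the step sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(theta):
    """Numerically stable softmax over action preferences."""
    z = theta - np.max(theta)
    e = np.exp(z)
    return e / e.sum()

def softmax_pg_step(theta, rewards, lr):
    """Exact policy gradient step under the softmax parameterization (bandit case)."""
    pi = softmax(theta)
    value = pi @ rewards
    # d/d theta_a of E_pi[r] equals pi_a * (r_a - value)
    return theta + lr * pi * (rewards - value)

def softmax_npg_step(theta, rewards, lr):
    """Natural policy gradient step under softmax: preferences move along the advantage."""
    pi = softmax(theta)
    value = pi @ rewards
    return theta + lr * (rewards - value)

def project_simplex(v):
    """Euclidean projection onto the probability simplex (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u)
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - (css - 1) / k > 0)[0][-1]
    tau = (css[rho] - 1) / (rho + 1)
    return np.maximum(v - tau, 0.0)

def projected_pg_step(pi, rewards, lr):
    """Projected policy gradient under the simplex (direct) parameterization."""
    # The gradient of pi @ rewards with respect to pi is the reward vector itself.
    return project_simplex(pi + lr * rewards)

# Assumed toy problem: three arms, arm 0 is the best.
rewards = np.array([1.0, 0.5, 0.0])

theta_pg = np.zeros(3)
theta_npg = np.zeros(3)
pi_proj = np.full(3, 1.0 / 3.0)
for _ in range(100):
    theta_pg = softmax_pg_step(theta_pg, rewards, lr=0.4)
    theta_npg = softmax_npg_step(theta_npg, rewards, lr=1.0)
    pi_proj = projected_pg_step(pi_proj, rewards, lr=0.1)

pi_pg = softmax(theta_pg)
pi_npg = softmax(theta_npg)
```

In this toy setting all three iterations shift probability mass toward the best arm; the projected variant keeps the policy on the simplex explicitly, while the two softmax variants do so through the parameterization.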

ReAct Meets ActRe: When Language Agents Enjoy Training Data Autonomy

no code yet • 21 Mar 2024

In WebShop, the 1-shot performance of the A$^3$T agent matches the human average, and 4 rounds of iterative refinement bring its performance close to that of human experts.

Global Optimality without Mixing Time Oracles in Average-reward RL via Multi-level Actor-Critic

no code yet • 18 Mar 2024

In the context of average-reward reinforcement learning, the requirement for oracle knowledge of the mixing time, a measure of the duration a Markov chain under a fixed policy needs to achieve its stationary distribution, poses a significant challenge for the global convergence of policy gradient methods.

Global Convergence Guarantees for Federated Policy Gradient Methods with Adversaries

no code yet • 15 Mar 2024

Federated Reinforcement Learning (FRL) allows multiple agents to collaboratively build a decision-making policy without sharing raw trajectories.

Provable Policy Gradient Methods for Average-Reward Markov Potential Games

no code yet • 9 Mar 2024

We prove that both algorithms based on independent policy gradient and independent natural policy gradient converge globally to a Nash equilibrium for the average reward criterion.

Fill-and-Spill: Deep Reinforcement Learning Policy Gradient Methods for Reservoir Operation Decision and Control

no code yet • 7 Mar 2024

Changes in demand, various hydrological inputs, and environmental stressors are among the issues that water managers and policymakers face on a regular basis.

Stabilizing Policy Gradients for Stochastic Differential Equations via Consistency with Perturbation Process

no code yet • 7 Mar 2024

Nevertheless, when applying policy gradients to SDEs, since the policy gradient is estimated on a finite set of trajectories, it can be ill-defined, and the policy behavior in data-scarce regions may be uncontrolled.

Towards Provable Log Density Policy Gradient

no code yet • 3 Mar 2024

In this work, we argue that this residual term is significant and that correcting for it could potentially improve the sample complexity of reinforcement learning methods.