Policy Gradient Methods
89 papers with code • 0 benchmarks • 2 datasets
Benchmarks
These leaderboards are used to track progress in Policy Gradient Methods.
Libraries
Use these libraries to find Policy Gradient Methods models and implementations.

Latest papers
Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement
Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning.
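As a minimal illustration of the policy-gradient alternative to behavior cloning (an illustrative sketch, not the paper's method), here is batch REINFORCE with a mean-reward baseline on a hypothetical two-armed bandit; the bandit, learning rate, and batch size are all assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.8])  # hypothetical 2-armed bandit
theta = np.zeros(2)                # logits of a softmax policy

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

for _ in range(300):
    probs = softmax(theta)
    actions = rng.choice(2, size=64, p=probs)   # sample a batch of actions
    rewards = rng.normal(true_means[actions], 0.1)
    baseline = rewards.mean()                   # variance-reducing baseline
    grad = np.zeros(2)
    for a, r in zip(actions, rewards):
        g = -probs.copy()                       # grad log pi(a) for a softmax
        g[a] += 1.0                             # policy: one_hot(a) - probs
        grad += (r - baseline) * g
    theta += 0.1 * grad / len(actions)          # gradient ascent step

print(softmax(theta))  # the policy now strongly prefers the better arm
```

The baseline subtraction is the simplest of the variance-reduction measures that practical policy gradient methods rely on.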
Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization
In Reinforcement Learning (RL), agents have no incentive to exhibit predictable behaviors and are often pushed (e.g., through policy entropy regularization) to randomize their actions in favor of exploration.
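Policy entropy regularization adds a bonus term beta * H(pi) to the objective, rewarding stochastic policies. A small sketch of its effect (illustrative, not from the paper) on a softmax policy over three actions with hypothetical fixed rewards, using the exact gradients of expected reward and entropy with respect to the logits:

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def entropy(p):
    return -np.sum(p * np.log(p))

r = np.array([1.0, 0.5, 0.0])  # hypothetical per-action rewards
results = {}
for beta in (0.0, 1.0):        # entropy coefficient
    theta = np.zeros(3)
    for _ in range(5000):
        p = softmax(theta)
        grad_J = p * (r - p @ r)                 # d(E[r])/d(logits) for softmax
        grad_H = p * (-np.log(p) - entropy(p))   # d(H)/d(logits) for softmax
        theta += 0.1 * (grad_J + beta * grad_H)  # ascend J + beta * H
    results[beta] = softmax(theta)

# beta = 0 collapses toward the greedy action; beta = 1 keeps the policy
# stochastic (the regularized optimum is proportional to exp(r / beta)).
print(results[0.0].round(3), results[1.0].round(3))
```

This is exactly the randomizing pressure the abstract refers to: the larger beta is, the further the optimal policy sits from deterministic behavior.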
Clipped-Objective Policy Gradients for Pessimistic Policy Optimization
To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences.
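The best-known strategy for large-but-safe policy changes is the clipped surrogate objective popularized by PPO, which this paper revisits. A minimal sketch of the clipping rule itself (the standard form, not code from the paper), where `ratio` is the new-to-old action probability ratio and `advantage` is an advantage estimate:

```python
import numpy as np

def clipped_surrogate(ratio, advantage, eps=0.2):
    """Clipped policy-gradient objective for one (state, action) sample:
    min(ratio * A, clip(ratio, 1 - eps, 1 + eps) * A)."""
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1 - eps, 1 + eps) * advantage)

# With a positive advantage, the gain is capped once the ratio
# exceeds 1 + eps, so there is no incentive for huge policy steps.
print(clipped_surrogate(1.5, 2.0))   # 2.4, not 3.0
# With a negative advantage, taking the min leaves the full penalty
# in place, which is the pessimistic side of the objective.
print(clipped_surrogate(1.5, -2.0))  # -3.0
```

The outer `min` is what makes the objective a pessimistic lower bound on the unclipped surrogate.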
Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach
We investigate the problem of learning an $\epsilon$-approximate solution for the discrete-time Linear Quadratic Regulator (LQR) problem via a Stochastic Variance-Reduced Policy Gradient (SVRPG) approach.
Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence
Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks the NE of zero-sum LQ games; (ii) in the model-free setting, we establish a $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator.
Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning
Environments for sequential decision-making problems often provide only sparse evaluative feedback to guide reinforcement-learning agents.
Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models
We focus on developing efficient and reliable policy optimization strategies for robot learning with real-world data.
Efficient Diffusion Policies for Offline Reinforcement Learning
Diffusion policies are incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods) because the likelihood of diffusion models is intractable.
Client Selection for Federated Policy Optimization with Environment Heterogeneity
This paper investigates the federated version of Approximate PI (API) and derives its error bound, taking into account the approximation error introduced by environment heterogeneity.
Policy Gradient Methods in the Presence of Symmetries and State Abstractions
Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization.