Policy Gradient Methods

89 papers with code • 0 benchmarks • 2 datasets



Self-Improvement for Neural Combinatorial Optimization: Sample without Replacement, but Improvement

grimmlab/gumbeldore 22 Mar 2024

Current methods for end-to-end constructive neural combinatorial optimization usually train a policy using behavior cloning from expert solutions or policy gradient methods from reinforcement learning.

3

Predictable Reinforcement Learning Dynamics through Entropy Rate Minimization

tud-amr/parl 30 Nov 2023

In Reinforcement Learning (RL), agents have no incentive to exhibit predictable behaviors and are often pushed (through e.g. policy entropy regularization) to randomize their actions in favor of exploration.
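
The policy entropy regularization mentioned in this excerpt can be sketched as below. This is a minimal illustration of the standard entropy bonus for a single discrete-action step, not the entropy rate minimization method of the paper; the function names and the coefficient `beta` are assumptions chosen for the example.

```python
import math


def softmax(logits):
    """Convert action logits to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy (in nats) of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def regularized_objective(logits, action, advantage, beta=0.01):
    """Per-step surrogate: log pi(a|s) * advantage + beta * H(pi(.|s)).

    `beta` is a hypothetical regularization coefficient. A larger value
    rewards more random (higher-entropy) action distributions.
    """
    probs = softmax(logits)
    pg_term = math.log(probs[action]) * advantage
    return pg_term + beta * entropy(probs)
```

Maximizing this surrogate with `beta > 0` favors spread-out action distributions, which is the randomizing pressure the excerpt describes.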

1

Clipped-Objective Policy Gradients for Pessimistic Policy Optimization

BIT-aerial-robotics/AquaML 10 Nov 2023

To facilitate efficient learning, policy gradient approaches to deep reinforcement learning (RL) are typically paired with variance reduction measures and strategies for making large but safe policy changes based on a batch of experiences.
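
One widely used strategy for making large but safe policy changes from a batch of experience is the clipped surrogate objective of PPO. The sketch below shows that standard objective only; it is not the clipped-objective policy gradient proposed in this paper, and the function names and the default `eps` are assumptions.

```python
def clipped_surrogate(ratio, advantage, eps=0.2):
    """PPO-style clipped surrogate for one sample.

    ratio     : pi_new(a|s) / pi_old(a|s)
    advantage : advantage estimate for the sampled action
    eps       : clipping range (hypothetical default)
    """
    clipped_ratio = max(1.0 - eps, min(ratio, 1.0 + eps))
    # Taking the minimum makes the objective a pessimistic bound:
    # moving the ratio beyond the clip range earns no extra credit.
    return min(ratio * advantage, clipped_ratio * advantage)


def batch_objective(ratios, advantages, eps=0.2):
    """Average clipped surrogate over a batch of samples."""
    terms = [clipped_surrogate(r, a, eps) for r, a in zip(ratios, advantages)]
    return sum(terms) / len(terms)
```

For a positive advantage the objective stops growing once the ratio exceeds `1 + eps`, and for a negative advantage it stops improving once the ratio drops below `1 - eps`.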

74

Oracle Complexity Reduction for Model-free LQR: A Stochastic Variance-Reduced Policy Gradient Approach

jd-anderson/lqr_svrpg 19 Sep 2023

We investigate the problem of learning an $\epsilon$-approximate solution for the discrete-time Linear Quadratic Regulator (LQR) problem via a Stochastic Variance-Reduced Policy Gradient (SVRPG) approach.

1

Learning Zero-Sum Linear Quadratic Games with Improved Sample Complexity and Last-Iterate Convergence

wujiduan/zero-sum-lq-games 8 Sep 2023

Our main results are two-fold: (i) in the deterministic setting, we establish the first global last-iterate linear convergence result for the nested algorithm that seeks NE of zero-sum LQ games; (ii) in the model-free setting, we establish a $\widetilde{\mathcal{O}}(\epsilon^{-2})$ sample complexity using a single-point ZO estimator.
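
A single-point zeroth-order (ZO) estimator builds a gradient estimate from one function evaluation per sample. The sketch below applies the generic estimator $(d/r)\, f(x + r u)\, u$, with $u$ uniform on the unit sphere, to an arbitrary scalar function; it is not the paper's algorithm for LQ games, and all names (`single_point_zo_gradient`, `radius`, `num_samples`) are assumptions.

```python
import math
import random


def sample_unit_sphere(dim, rng):
    """Draw a direction uniformly from the unit sphere in R^dim."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]


def single_point_zo_gradient(f, x, radius, rng):
    """One-sample estimate: (d / radius) * f(x + radius * u) * u."""
    d = len(x)
    u = sample_unit_sphere(d, rng)
    perturbed = [xi + radius * ui for xi, ui in zip(x, u)]
    scale = d / radius * f(perturbed)
    return [scale * ui for ui in u]


def averaged_zo_gradient(f, x, radius, num_samples, seed=0):
    """Average independent single-point estimates to reduce variance."""
    rng = random.Random(seed)
    total = [0.0] * len(x)
    for _ in range(num_samples):
        g = single_point_zo_gradient(f, x, radius, rng)
        total = [t + gi for t, gi in zip(total, g)]
    return [t / num_samples for t in total]
```

In expectation the estimator returns the gradient of a smoothed version of `f`. Each sample is noisy, which is why averaging over many samples is needed and why sample complexity is the quantity of interest.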

0

Hindsight-DICE: Stable Credit Assignment for Deep Reinforcement Learning

skandavaidyanath/credit-assignment 21 Jul 2023

Oftentimes, environments for sequential decision-making problems can be quite sparse in the provision of evaluative feedback to guide reinforcement-learning agents.

1

Enabling Efficient, Reliable Real-World Reinforcement Learning with Approximate Physics-Based Models

clearoboticslab/learningwithsimplemodels.jl 16 Jul 2023

We focus on developing efficient and reliable policy optimization strategies for robot learning with real-world data.

2

Efficient Diffusion Policies for Offline Reinforcement Learning

sail-sg/edp NeurIPS 2023

2) It is incompatible with maximum likelihood-based RL algorithms (e.g., policy gradient methods) as the likelihood of diffusion models is intractable.

53
31 May 2023

Client Selection for Federated Policy Optimization with Environment Heterogeneity

shiehshieh/fedpohcs 18 May 2023

This paper investigates the federated version of Approximate PI (API) and derives its error bound, taking into account the approximation error introduced by environment heterogeneity.

0

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

sahandrez/homomorphic_policy_gradient 9 May 2023

Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization.

18