Policy Gradient Methods

90 papers with code • 0 benchmarks • 2 datasets

Most implemented papers

Analysis of the Optimization Landscape of Linear Quadratic Gaussian (LQG) Control

zhengy09/LQG_gradient 8 Feb 2021

This paper revisits the classical Linear Quadratic Gaussian (LQG) control from a modern optimization perspective.

Policy Gradient Methods in the Presence of Symmetries and State Abstractions

sahandrez/homomorphic_policy_gradient 9 May 2023

Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization.

Dual Learning for Machine Translation

NonameAuPlatal/Dual_Learning NeurIPS 2016

Based on the feedback signals generated during this process (e.g., the language-model likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using policy gradient methods).

Reproducibility of Benchmarked Deep Reinforcement Learning Tasks for Continuous Control

Breakend/ReproducibilityInContinuousPolicyGradientMethods 10 Aug 2017

We investigate and discuss: the significance of hyper-parameters in policy gradients for continuous control, general variance in the algorithms, and reproducibility of reported results.

Cold-Start Reinforcement Learning with Softmax Policy Gradient

jacksonchen1998/Cold-Start-Reinforcement-Learning-with-Softmax-Policy-Gradient NeurIPS 2017

Policy-gradient approaches to reinforcement learning have two common and undesirable overhead procedures, namely warm-start training and sample variance reduction.

Hindsight policy gradients

paulorauber/hpg ICLR 2019

A reinforcement learning agent that needs to pursue different goals across episodes requires a goal-conditional policy.
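
As a rough illustration of what "goal-conditional" means here, the sketch below defines a tabular softmax policy whose action probabilities depend on both the state and the goal. It is a minimal sketch under stated assumptions, not code from paulorauber/hpg: the function name `goal_conditional_policy`, the dictionary layout of `theta`, and the toy states and goals are all hypothetical.

```python
import math


def goal_conditional_policy(theta, state, goal):
    """Return action probabilities pi(a | state, goal).

    theta is a hypothetical parameter table mapping (state, goal) pairs
    to a list of logits, one per action.
    """
    logits = theta[(state, goal)]
    max_logit = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - max_logit) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Toy example: one state, two goals, two actions (all values are made up).
theta = {
    ("s0", "goal_left"): [2.0, 0.0],
    ("s0", "goal_right"): [0.0, 2.0],
}
probs_left = goal_conditional_policy(theta, "s0", "goal_left")
probs_right = goal_conditional_policy(theta, "s0", "goal_right")
```

The same state yields different action distributions depending on the goal, which is the property an agent pursuing different goals across episodes relies on.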

Run, skeleton, run: skeletal model in a physics-based simulation

Scitator/Run-Skeleton-Run 18 Nov 2017

In this paper, we present our approach to solving the physics-based reinforcement learning challenge "Learning to Run", whose objective is to train a physiologically-based human model to navigate a complex obstacle course as quickly as possible.

Divide-and-Conquer Reinforcement Learning

dibyaghosh/dnc ICLR 2018

In this paper, we develop a novel algorithm that instead partitions the initial state space into "slices", and optimizes an ensemble of policies, each on a different slice.
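
The idea of routing each episode to the policy responsible for its slice can be sketched as follows. This is only an illustration: the nearest-centroid partition rule, the function names `assign_slice` and `select_policy`, and the placeholder policies are assumptions made for the example, and the summary above does not say how dibyaghosh/dnc actually builds its slices or trains the ensemble.

```python
def assign_slice(initial_state, centroids):
    """Return the index of the centroid closest to initial_state.

    Uses squared Euclidean distance; the nearest-centroid rule is an
    assumption chosen for illustration.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(centroids)),
               key=lambda i: sq_dist(initial_state, centroids[i]))


def select_policy(initial_state, centroids, policies):
    """Pick the ensemble member responsible for this initial state's slice."""
    return policies[assign_slice(initial_state, centroids)]


# Toy example with two slices and two placeholder policies (hypothetical).
centroids = [(0.0, 0.0), (10.0, 10.0)]
policies = ["policy_for_slice_0", "policy_for_slice_1"]
chosen_near_origin = select_policy((1.0, 2.0), centroids, policies)
chosen_far = select_policy((9.0, 8.0), centroids, policies)
```

Each policy then only ever sees episodes that start in its own slice, which is what makes the per-slice optimization problems narrower than the original one.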

Bayesian Policy Gradients via Alpha Divergence Dropout Inference

Breakend/BayesianPolicyGradients 6 Dec 2017

Policy gradient methods have had great success in solving continuous control tasks, yet the stochastic nature of such problems makes deterministic value estimation difficult.

Clipped Action Policy Gradient

pfnet-research/capg ICML 2018

We propose a policy gradient estimator that exploits the knowledge of actions being clipped to reduce the variance in estimation.
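
One way to read "exploits the knowledge of actions being clipped" is sketched below, assuming a one-dimensional Gaussian policy whose raw samples are clipped to `[low, high]` before execution. Every raw sample beyond a bound produces the same executed action, so the log-probability can use the total probability mass beyond that bound instead of the density at the raw sample. The function names are hypothetical and this is not the implementation in pfnet-research/capg.

```python
import math


def normal_log_pdf(x, mean, std):
    """Log density of a Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(x, mean, std):
    """Cumulative distribution function of N(mean, std^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mean) / (std * math.sqrt(2.0))))


def clipped_action_log_prob(u, mean, std, low, high):
    """Log-probability of the executed action clip(u, low, high).

    Below `low` or above `high`, all raw samples map to the same executed
    action, so the probability mass of the whole tail is used.
    """
    if u <= low:
        return math.log(normal_cdf(low, mean, std))
    if u >= high:
        return math.log(1.0 - normal_cdf(high, mean, std))
    return normal_log_pdf(u, mean, std)


# Two different out-of-range samples give the same value.
tail_a = clipped_action_log_prob(-5.0, 0.0, 1.0, -1.0, 1.0)
tail_b = clipped_action_log_prob(-2.0, 0.0, 1.0, -1.0, 1.0)
inside = clipped_action_log_prob(0.0, 0.0, 1.0, -1.0, 1.0)
```

In a policy gradient estimator, this quantity would be differentiated with respect to `mean` and `std`. Because it no longer varies with how far outside the bounds the raw sample landed, that source of variance drops out.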