Deep Deterministic Policy Gradient

Introduced by Lillicrap et al. in Continuous control with deep reinforcement learning

DDPG, or Deep Deterministic Policy Gradient, is a model-free, actor-critic algorithm based on the deterministic policy gradient that can operate over continuous action spaces. It combines the actor-critic approach with two insights from Deep Q-Networks (DQN): 1) the network is trained off-policy on samples drawn from a replay buffer, which minimizes correlations between samples, and 2) the network is trained with a target Q network, which gives consistent targets during temporal difference backups. DDPG uses both ideas and adds batch normalization.
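
The two DQN-derived ideas can be illustrated with a minimal, standard-library-only sketch. It is not the authors' implementation: the names `ReplayBuffer`, `td_target`, and `soft_update` are hypothetical, the actor and critic are stand-in callables rather than neural networks, and the slowly moving "soft" target update with rate `tau` is assumed as the way the target network is kept consistent. The default values of `gamma` and `tau` are illustrative.

```python
import random
from collections import deque


class ReplayBuffer:
    """Fixed-size store of (state, action, reward, next_state, done) transitions."""

    def __init__(self, capacity):
        # A bounded deque evicts the oldest transition once capacity is reached.
        self.storage = deque(maxlen=capacity)

    def add(self, state, action, reward, next_state, done):
        self.storage.append((state, action, reward, next_state, done))

    def sample(self, batch_size, rng=random):
        # Uniform sampling without replacement breaks up the temporal
        # correlation between consecutive transitions.
        return rng.sample(list(self.storage), batch_size)

    def __len__(self):
        return len(self.storage)


def td_target(reward, next_state, done, target_actor, target_critic, gamma=0.99):
    """Compute y = r + gamma * Q'(s', mu'(s')), with no bootstrap on terminal steps.

    target_actor and target_critic are hypothetical callables standing in for
    the target policy and target Q networks.
    """
    if done:
        return reward
    next_action = target_actor(next_state)
    return reward + gamma * target_critic(next_state, next_action)


def soft_update(target_params, online_params, tau=0.001):
    """Return target parameters moved a fraction tau toward the online ones."""
    return [
        tau * w + (1.0 - tau) * w_target
        for w, w_target in zip(online_params, target_params)
    ]
```

In a training loop under these assumptions, each environment step would call `add`, each update would draw a minibatch with `sample`, regress the critic toward `td_target`, and then apply `soft_update` to both target networks so that the regression targets change slowly.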

Source: Continuous control with deep reinforcement learning

Latest Papers

PAPER DATE
Truly Deterministic Policy Optimization
Anonymous
2021-01-01
Reinforcement Learning for Control of Valves
Rajesh Siraskar
2020-12-29
Policy Gradient RL Algorithms as Directed Acyclic Graphs
Juan Jose Garau Luis
2020-12-14
Efficient Reservoir Management through Deep Reinforcement Learning
Xinrun Wang, Tarun Nair, Haoyang Li, Yuh Sheng Reuben Wong, Nachiket Kelkar, Srinivas Vaidyanathan, Rajat Nayak, Bo An, Jagdish Krishnaswamy, Milind Tambe
2020-12-07
FinRL: A Deep Reinforcement Learning Library for Automated Stock Trading in Quantitative Finance
Xiao-Yang Liu, Hongyang Yang, Qian Chen, Runjia Zhang, Liuqing Yang, Bowen Xiao, Christina Dan Wang
2020-11-19
Tonic: A Deep Reinforcement Learning Library for Fast Prototyping and Benchmarking
Fabio Pardo
2020-11-15
Recurrent Distributed Reinforcement Learning for Partially Observable Robotic Assembly
Jieliang Luo, Hui Li
2020-10-15
Hindsight Experience Replay with Kronecker Product Approximate Curvature
Dhuruva Priyan G M, Abhik Singla, Shalabh Bhatnagar
2020-10-09
Knowledge-Assisted Deep Reinforcement Learning in 5G Scheduler Design: From Theoretical Framework to Implementation
Zhouyou Gu, Changyang She, Wibowo Hardjawana, Simon Lumb, David McKechnie, Todd Essery, Branka Vucetic
2020-09-17
Human-like Energy Management Based on Deep Reinforcement Learning and Historical Driving Experiences
Teng Liu, Xiaolin Tang, Xiaosong Hu, Wenhao Tan, Jinwei Zhang
2020-07-16
Regularly Updated Deterministic Policy Gradient Algorithm
Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu
2020-07-01
Distributed Uplink Beamforming in Cell-Free Networks Using Deep Reinforcement Learning
Firas Fredj, Yasser Al-Eryani, Setareh Maghsudi, Mohamed Akrout, Ekram Hossain
2020-06-26
Noise, overestimation and exploration in Deep Reinforcement Learning
Rafael Stekolshchik
2020-06-25
The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning
Lingheng Meng, Rob Gorbet, Dana Kulić
2020-06-23
Reducing Estimation Bias via Weighted Delayed Deep Deterministic Policy Gradient
Qiang He, Xinwen Hou
2020-06-18
An online evolving framework for advancing reinforcement-learning based automated vehicle control
Teawon Han, Subramanya Nageshrao, Dimitar P. Filev, Umit Ozguner
2020-06-15
Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications
Jiaye Lin, Yuze Zou, Xiaoru Dong, Shimin Gong, Dinh Thai Hoang, Dusit Niyato
2020-05-25
PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning
Guillaume Matheron, Nicolas Perrin, Olivier Sigaud
2020-04-24
Model-based actor-critic: GAN + DRL (actor-critic) => AGI
Aras Dargazany
2020-04-04
Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping
Daniel Zhang, Colleen P. Bailey
2020-03-28
Accelerating Deep Reinforcement Learning With the Aid of Partial Model: Energy-Efficient Predictive Video Streaming
Dong Liu, Jianyu Zhao, Chenyang Yang, Lajos Hanzo
2020-03-21
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh
2020-03-19
Adaptive Discretization for Continuous Control using Particle Filtering Policy Network
Pei Xu, Ioannis Karamouzas
2020-03-16
Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
Wei Zhou, Yiying Li, Yongxin Yang, Huaimin Wang, Timothy M. Hospedales
2020-03-11
Dynamic Experience Replay
Jieliang Luo, Hui Li
2020-03-04
