Experience Replay

Experience Replay is a reinforcement learning technique in which the agent's experience at each time-step, $e_{t} = \left(s_{t}, a_{t}, r_{t}, s_{t+1}\right)$, is stored in a dataset $D = \{e_{1}, \cdots, e_{N}\}$, pooled over many episodes into a replay memory. Training then usually draws a random minibatch of experiences from this memory and uses it to learn off-policy, as in Deep Q-Networks. Because consecutive time-steps are strongly correlated, training on them in order can be unstable; random sampling breaks up that autocorrelation and makes the problem more like supervised learning.
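
The store-then-sample loop described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation from any particular paper: the names `Transition`, `ReplayMemory`, `store`, and `sample`, the fixed capacity with oldest-first eviction, and the integer toy states are all assumptions made for the example.

```python
import random
from collections import deque, namedtuple

# One experience e_t = (s_t, a_t, r_t, s_{t+1}); field names are illustrative.
Transition = namedtuple("Transition", ["state", "action", "reward", "next_state"])


class ReplayMemory:
    """Fixed-capacity replay memory D with uniform random sampling (assumed design)."""

    def __init__(self, capacity, seed=None):
        # deque(maxlen=...) drops the oldest transition once the memory is full.
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def store(self, state, action, reward, next_state):
        self.buffer.append(Transition(state, action, reward, next_state))

    def sample(self, batch_size):
        # Uniform sampling without replacement over the whole memory, so a
        # minibatch is not a run of consecutive, correlated time-steps.
        return self.rng.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


memory = ReplayMemory(capacity=5, seed=0)
for t in range(8):  # toy trajectory: states are plain integers
    memory.store(state=t, action=t % 2, reward=1.0, next_state=t + 1)

minibatch = memory.sample(batch_size=3)
```

In a DQN-style learner, each sampled minibatch would then feed one gradient step on the Q-network; that step is omitted here because it depends on the model and loss being used.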

Image Credit: Hands-On Reinforcement Learning with Python, Sudharsan Ravichandiran

Latest Papers

PAPER DATE
Self-Adapting Recurrent Models for Object Pushing from Learning in Simulation
Lin Cong, Michael Görner, Philipp Ruppel, Hongzhuo Liang, Norman Hendrich, Jianwei Zhang
2020-07-27
Learning Compositional Neural Programs for Continuous Control
Thomas Pierrot, Nicolas Perrin, Feryal Behbahani, Alexandre Laterre, Olivier Sigaud, Karim Beguir, Nando de Freitas
2020-07-27
Complex Robotic Manipulation via Graph-Based Hindsight Goal Generation
Zhenshan Bing, Matthias Brucker, Fabrice O. Morin, Kai Huang, Alois Knoll
2020-07-27
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu
2020-07-21
Beyond Prioritized Replay: Sampling States in Model-Based RL via Simulated Priorities
Jincheng Mei, Yangchen Pan, Martha White, Amir-massoud Farahmand, Hengshuai Yao
2020-07-19
Collision Avoidance Robotics Via Meta-Learning (CARML)
Abhiram Iyer, Aravind Mahadevan
2020-07-16
Human-like Energy Management Based on Deep Reinforcement Learning and Historical Driving Experiences
Teng Liu, Xiaolin Tang, Xiaosong Hu, Wenhao Tan, Jinwei Zhang
2020-07-16
Learning to Sample with Local and Global Contexts in Experience Replay Buffer
Youngmin Oh, Kimin Lee, Jinwoo Shin, Eunho Yang, Sung Ju Hwang
2020-07-14
Revisiting Fundamentals of Experience Replay
William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney
2020-07-13
An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay
Scott Fujimoto, David Meger, Doina Precup
2020-07-12
Batch-level Experience Replay with Review for Continual Learning
Zheda Mai, Hyunwoo Kim, Jihwan Jeong, Scott Sanner
2020-07-11
Self-Supervised Policy Adaptation during Deployment
Nicklas Hansen, Yu Sun, Pieter Abbeel, Alexei A. Efros, Lerrel Pinto, Xiaolong Wang
2020-07-08
Double Prioritized State Recycled Experience Replay
Fanchen Bu, Dong Eui Chang
2020-07-08
Counterfactual Data Augmentation using Locally Factored Dynamics
Silviu Pitis, Elliot Creager, Animesh Garg
2020-07-06
UAV Path Planning for Wireless Data Harvesting: A Deep Reinforcement Learning Approach
Harald Bayerlein, Mirco Theile, Marco Caccamo, David Gesbert
2020-07-01
Developing cooperative policies for multi-stage tasks
Jordan Erskine, Chris Lehnert
2020-07-01
Regularly Updated Deterministic Policy Gradient Algorithm
Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu
2020-07-01
Distributed Uplink Beamforming in Cell-Free Networks Using Deep Reinforcement Learning
Firas Fredj, Yasser Al-Eryani, Setareh Maghsudi, Mohamed Akrout, Ekram Hossain
2020-06-26
Noise, overestimation and exploration in Deep Reinforcement Learning
Rafael Stekolshchik
2020-06-25
The Effect of Multi-step Methods on Overestimation in Deep Reinforcement Learning
Lingheng Meng, Rob Gorbet, Dana Kulić
2020-06-23
Experience Replay with Likelihood-free Importance Weights
Samarth Sinha, Jiaming Song, Animesh Garg, Stefano Ermon
2020-06-23
AutoOD: Automated Outlier Detection via Curiosity-guided Search and Self-imitation Learning
Yuening Li, Zhengzhang Chen, Daochen Zha, Kaixiong Zhou, Haifeng Jin, Haifeng Chen, Xia Hu
2020-06-19
Reducing Estimation Bias via Weighted Delayed Deep Deterministic Policy Gradient
Qiang He, Xinwen Hou
2020-06-18
Forgetful Experience Replay in Hierarchical Reinforcement Learning from Demonstrations
Alexey Skrynnik, Aleksey Staroverov, Ermek Aitygulov, Kirill Aksenov, Vasilii Davydov, Aleksandr I. Panov
2020-06-17
Reinforcement Learning with Uncertainty Estimation for Tactical Decision-Making in Intersections
Carl-Johan Hoel, Tommy Tram, Jonas Sjöberg
2020-06-17
Least Squares Regression with Markovian Data: Fundamental Limits and Algorithms
Guy Bresler, Prateek Jain, Dheeraj Nagaraj, Praneeth Netrapalli, Xian Wu
2020-06-16
An online evolving framework for advancing reinforcement-learning based automated vehicle control
Teawon Han, Subramanya Nageshrao, Dimitar P. Filev, Umit Ozguner
2020-06-15
Human and Multi-Agent collaboration in a human-MARL teaming framework
Neda Navidi, Francois Chabot, Sagar Kurandwad, Irv Lustigman, Vincent Robert, Gregory Szriftgiser, Andrea Schuch
2020-06-12
Balancing a CartPole System with Reinforcement Learning -- A Tutorial
Swagat Kumar
2020-06-08
PlanGAN: Model-based Planning With Sparse Rewards and Multiple Goals
Henry Charlesworth, Giovanni Montana
2020-06-01
Manipulating the Distributions of Experience used for Self-Play Learning in Expert Iteration
Dennis J. N. J. Soemers, Éric Piette, Matthew Stephenson, Cameron Browne
2020-05-30
Optimization-driven Deep Reinforcement Learning for Robust Beamforming in IRS-assisted Wireless Communications
Jiaye Lin, Yuze Zou, Xiaoru Dong, Shimin Gong, Dinh Thai Hoang, Dusit Niyato
2020-05-25
Experience Augmentation: Boosting and Accelerating Off-Policy Multi-Agent Reinforcement Learning
Zhenhui Ye, Yining Chen, Guanghua Song, Bowei Yang, Shen Fan
2020-05-19
Automating Turbulence Modeling by Multi-Agent Reinforcement Learning
Guido Novati, Hugues Lascombes de Laroussilhe, Petros Koumoutsakos
2020-05-18
Proxy Experience Replay: Federated Distillation for Distributed Reinforcement Learning
Han Cha, Jihong Park, Hyesung Kim, Mehdi Bennis, Seong-Lyun Kim
2020-05-13
Generalized State-Dependent Exploration for Deep Reinforcement Learning in Robotics
Antonin Raffin, Freek Stulp
2020-05-12
TOMA: Topological Map Abstraction for Reinforcement Learning
Zhao-Heng Yin, Wu-Jun Li
2020-05-11
Discrete-to-Deep Supervised Policy Learning
Budi Kurniawan, Peter Vamplew, Michael Papasimeon, Richard Dazeley, Cameron Foale
2020-05-05
DSAC: Distributional Soft Actor Critic for Risk-Sensitive Reinforcement Learning
Xiaoteng Ma, Li Xia, Zhengyuan Zhou, Jun Yang, Qianchuan Zhao
2020-04-30
Delay-aware Resource Allocation in Fog-assisted IoT Networks Through Reinforcement Learning
Qiang Fan, Jianan Bai, Hongxia Zhang, Yang Yi, Lingjia Liu
2020-04-30
PBCS : Efficient Exploration and Exploitation Using a Synergy between Reinforcement Learning and Motion Planning
Guillaume Matheron, Nicolas Perrin, Olivier Sigaud
2020-04-24
Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang, Bo Liu, Shimon Whiteson
2020-04-22
STDPG: A Spatio-Temporal Deterministic Policy Gradient Agent for Dynamic Routing in SDN
Juan Chen, Zhiwen Xiao, Huanlai Xing, Penglin Dai, Shouxi Luo, Muhammad Azhar Iqbal
2020-04-21
Dark Experience for General Continual Learning: a Strong, Simple Baseline
Pietro Buzzega, Matteo Boschini, Angelo Porrello, Davide Abati, Simone Calderara
2020-04-15
Model-based actor-critic: GAN + DRL (actor-critic) => AGI
Aras Dargazany
2020-04-04
Obstacle Avoidance and Navigation Utilizing Reinforcement Learning with Reward Shaping
Daniel Zhang, Colleen P. Bailey
2020-03-28
Multi-Agent Reinforcement Learning for Problems with Combined Individual and Team Reward
Hassam Ullah Sheikh, Ladislau Bölöni
2020-03-24
Evolutionary Population Curriculum for Scaling Multi-Agent Reinforcement Learning
Qian Long, Zihan Zhou, Abhibav Gupta, Fei Fang, Yi Wu, Xiaolong Wang
2020-03-23
Continual Graph Learning
Fan Zhou, Chengtai Cao, Ting Zhong, Kunpeng Zhang, Goce Trajcevski, Ji Geng
2020-03-22
Adversarial Continual Learning
Sayna Ebrahimi, Franziska Meier, Roberto Calandra, Trevor Darrell, Marcus Rohrbach
2020-03-21
Accelerating Deep Reinforcement Learning With the Aid of a Partial Model: Power-Efficient Predictive Video Streaming
Dong Liu, Jianyu Zhao, Chenyang Yang, Lajos Hanzo
2020-03-21
Online Continual Learning on Sequences
German I. Parisi, Vincenzo Lomonaco
2020-03-20
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh
2020-03-19
Deep Multi-Agent Reinforcement Learning for Decentralized Continuous Cooperative Control
Christian Schroeder de Witt, Bei Peng, Pierre-Alexandre Kamienny, Philip Torr, Wendelin Böhmer, Shimon Whiteson
2020-03-14
Sample Efficient Reinforcement Learning through Learning from Demonstrations in Minecraft
Christian Scheller, Yanick Schraner, Manfred Vogel
2020-03-12
Online Meta-Critic Learning for Off-Policy Actor-Critic Methods
Wei Zhou, Yiying Li, Yongxin Yang, Huaimin Wang, Timothy M. Hospedales
2020-03-11
Dynamic Experience Replay
Jieliang Luo, Hui Li
2020-03-04
Reinforcement co-Learning of Deep and Spiking Neural Networks for Energy-Efficient Mapless Navigation with Neuromorphic Hardware
Guangzhi Tang, Neelesh Kumar, Konstantinos P. Michmizos
2020-03-02
A Self-Tuning Actor-Critic Algorithm
Tom Zahavy, Zhongwen Xu, Vivek Veeriah, Matteo Hessel, Junhyuk Oh, Hado van Hasselt, David Silver, Satinder Singh
2020-02-28
Data Freshness and Energy-Efficient UAV Navigation Optimization: A Deep Reinforcement Learning Approach
Sarder Fakhrul Abedin, Md. Shirajum Munir, Nguyen H. Tran, Zhu Han, Choong Seon Hong
2020-02-21
Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning
Yuanyi Zhong, Alexander Schwing, Jian Peng
2020-02-21
Using Hindsight to Anchor Past Knowledge in Continual Learning
Arslan Chaudhry, Albert Gordo, Puneet K. Dokania, Philip Torr, David Lopez-Paz
2020-02-19
Adaptive Experience Selection for Policy Gradient
Saad Mohamad, Giovanni Montana
2020-02-17
XCS Classifier System with Experience Replay
Anthony Stein, Roland Maier, Lukas Rosenbauer, Jörg Hähner
2020-02-13
Fast Reinforcement Learning for Anti-jamming Communications
Pei-Gen Ye, Yuan-Gen Wang, Jin Li, Liang Xiao
2020-02-13
Soft Hindsight Experience Replay
Qiwei He, Liansheng Zhuang, Houqiang Li
2020-02-06
Bootstrapping a DQN Replay Memory with Synthetic Experiences
Wenzel Baron Pilar von Pilchau, Anthony Stein, Jörg Hähner
2020-02-04
Stacked Auto Encoder Based Deep Reinforcement Learning for Online Resource Scheduling in Large-Scale MEC Networks
Feibo Jiang, Kezhi Wang, Li Dong, Cunhua Pan, Kun Yang
2020-01-24
Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning
Jianyu Chen, Shengbo Eben Li, Masayoshi Tomizuka
2020-01-23
Cooperative Highway Work Zone Merge Control based on Reinforcement Learning in A Connected and Automated Environment
Tianzhu Ren, Yuanchang Xie, Liming Jiang
2020-01-21
Discriminator Soft Actor Critic without Extrinsic Rewards
Daichi Nishio, Daiki Kuyoshi, Toi Tsuneda, Satoshi Yamane
2020-01-19
Effects of sparse rewards of different magnitudes in the speed of learning of model-based actor critic methods
Juan Vargas, Lazar Andjelic, Amir Barati Farimani
2020-01-18
Continuous-action Reinforcement Learning for Playing Racing Games: Comparing SPG to PPO
Mario S. Holubar, Marco A. Wiering
2020-01-15
Deep Reinforcement Learning for Complex Manipulation Tasks with Sparse Feedback
Binyamin Manela
2020-01-12
Reward Engineering for Object Pick and Place Training
Raghav Nagpal, Achyuthan Unni Krishnan, Hanshen Yu
2020-01-11
Population-Guided Parallel Policy Search for Reinforcement Learning
Whiyoung Jung, Giseung Park, Youngchul Sung
2020-01-09
Online Learned Continual Compression with Stacked Quantization Modules
Anonymous
2020-01-01
CrossNorm: On Normalization for Off-Policy Reinforcement Learning
Anonymous
2020-01-01
Dynamically Balanced Value Estimates for Actor-Critic Methods
Anonymous
2020-01-01
Towards Simplicity in Deep Reinforcement Learning: Streamlined Off-Policy Learning
Anonymous
2020-01-01
Exploiting the potential of deep reinforcement learning for classification tasks in high-dimensional and unstructured data
Johan S. Obando-Ceron, Victor Romero Cano, Walter Mayor Toro
2019-12-20
Recruitment-imitation Mechanism for Evolutionary Reinforcement Learning
Shuai Lü, Shuai Han, Wenbo Zhou, Junwei Zhang
2019-12-13
Learning Sparse Representations Incrementally in Deep Reinforcement Learning
J. Fernando Hernandez-Garcia, Richard S. Sutton
2019-12-09
On-policy Reinforcement Learning with Entropy Regularization
Jingbin Liu, Xinyang Gu, Dexiang Zhang, Shuai Liu
2019-12-02
Reconciling λ-Returns with Experience Replay
Brett Daley, Christopher Amato
2019-12-01
Online Continual Learning with Maximal Interfered Retrieval
Rahaf Aljundi, Eugene Belilovsky, Tinne Tuytelaars, Laurent Charlin, Massimo Caccia, Min Lin, Lucas Page-Caccia
2019-12-01
Curriculum-guided Hindsight Experience Replay
Meng Fang, Tianyi Zhou, Yali Du, Lei Han, Zhengyou Zhang
2019-12-01
Better Exploration with Optimistic Actor Critic
Kamil Ciosek, Quan Vuong, Robert Loftin, Katja Hofmann
2019-12-01
Learning Reward Machines for Partially Observable Reinforcement Learning
Rodrigo Toro Icarte, Ethan Waldie, Toryn Klassen, Rick Valenzano, Margarita Castro, Sheila Mcilraith
2019-12-01
IMPACT: Importance Weighted Asynchronous Architectures with Clipped Target Networks
Michael Luo, Jiahao Yao, Richard Liaw, Eric Liang, Ion Stoica
2019-11-30
Distributed Soft Actor-Critic with Multivariate Reward Representation and Knowledge Distillation
Dmitry Akimov
2019-11-29
Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing using Deep Reinforcement Learning
Zining Liu, Chong Long, Xiaolu Lu, Zehong Hu, Jie Zhang, Yafang Wang
2019-11-24
Placement Optimization of Aerial Base Stations with Deep Reinforcement Learning
Jin Qiu, Jiangbin Lyu, Liqun Fu
2019-11-19
Improved Exploration through Latent Trajectory Optimization in Deep Deterministic Policy Gradient
Kevin Sebastian Luck, Mel Vecerik, Simon Stepputtis, Heni Ben Amor, Jonathan Scholz
2019-11-15
Data-efficient Co-Adaptation of Morphology and Behaviour with Deep Reinforcement Learning
Kevin Sebastian Luck, Heni Ben Amor, Roberto Calandra
2019-11-15
Deep Reinforcement Learning Based Dynamic Trajectory Control for UAV-assisted Mobile Edge Computing
Liang Wang, Kezhi Wang, Cunhua Pan, Wei Xu, Nauman Aslam, Arumugam Nallanathan
2019-11-10
Policy Continuation with Hindsight Inverse Dynamics
Hao Sun, Zhizhong Li, Xiaotong Liu, Dahua Lin, Bolei Zhou
2019-10-30
Overcoming Catastrophic Interference in Online Reinforcement Learning with Dynamic Self-Organizing Maps
Yat Long Lo, Sina Ghiassian
2019-10-29
Better Exploration with Optimistic Actor-Critic
Kamil Ciosek, Quan Vuong, Robert Loftin, Katja Hofmann
2019-10-28
Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order
Vladislav Kurenkov, Bulat Maksudov, Adil Khan
2019-10-27
Self-Educated Language Agent With Hindsight Experience Replay For Instruction Following
Geoffrey Cideron, Mathieu Seurin, Florian Strub, Olivier Pietquin
2019-10-21
Reverse Experience Replay
Egor Rotinov
2019-10-19
Towards More Sample Efficiency in Reinforcement Learning with Data Augmentation
Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Juan Rojas, Paul Weng
2019-10-19
Ctrl-Z: Recovering from Instability in Reinforcement Learning
Vibhavari Dasagi, Jake Bruce, Thierry Peynot, Jürgen Leitner
2019-10-09
TorchBeast: A PyTorch Platform for Distributed RL
Heinrich Küttler, Nantas Nardelli, Thibaut Lavril, Marco Selvatici, Viswanath Sivakumar, Tim Rocktäschel, Edward Grefenstette
2019-10-08
Striving for Simplicity and Performance in Off-Policy DRL: Output Normalization and Non-Uniform Sampling
Che Wang, Yanqiu Wu, Quan Vuong, Keith Ross
2019-10-05
Quantized Reinforcement Learning (QUARL)
Srivatsan Krishnan, Sharad Chitlangia, Maximilian Lam, Zishen Wan, Aleksandra Faust, Vijay Janapa Reddi
2019-10-02
Off-policy Multi-step Q-learning
Gabriel Kalweit, Maria Huegle, Joschka Boedecker
2019-09-30
Off-Policy Actor-Critic with Shared Experience Replay
Simon Schmitt, Matteo Hessel, Karen Simonyan
2019-09-25
Invariant Transform Experience Replay: Data Augmentation for Deep Reinforcement Learning
Yijiong Lin, Jiancong Huang, Matthieu Zimmer, Yisheng Guan, Juan Rojas, Paul Weng
2019-09-24
Constrained Attractor Selection Using Deep Reinforcement Learning
Xue-She Wang, James D. Turner, Brian P. Mann
2019-09-23
Deterministic Value-Policy Gradients
Qingpeng Cai, Ling Pan, Pingzhong Tang
2019-09-09
AC-Teach: A Bayesian Actor-Critic Method for Policy Learning with an Ensemble of Suboptimal Teachers
Andrey Kurenkov, Ajay Mandlekar, Roberto Martin-Martin, Silvio Savarese, Animesh Garg
2019-09-09
Deep Reinforcement Learning for Control of Probabilistic Boolean Networks
Georgios Papagiannis, Sotiris Moschoyiannis
2019-09-07
Efficient Automatic Meta Optimization Search for Few-Shot Learning
Xinyue Zheng, Peng Wang, Qigang Wang, Zhongchao Shi, Feiyu Xu
2019-09-06
Iterative Update and Unified Representation for Multi-Agent Reinforcement Learning
Jiancheng Long, Hongming Zhang, Tianyang Yu, Bo Xu
2019-08-16
Online Continual Learning with Maximally Interfered Retrieval
Rahaf Aljundi, Lucas Caccia, Eugene Belilovsky, Massimo Caccia, Min Lin, Laurent Charlin, Tinne Tuytelaars
2019-08-11
Incremental Reinforcement Learning --- a New Continuous Reinforcement Learning Frame Based on Stochastic Differential Equation methods
Tianhao Chen, Limei Cheng, Yang Liu, Wenchuan Jia, Shugen Ma
2019-08-08
Attention Control with Metric Learning Alignment for Image Set-based Recognition
Xiaofeng Liu, Zhenhua Guo, Jane You, B. V. K Vijaya Kumar
2019-08-05
Prioritized Guidance for Efficient Multi-Agent Reinforcement Learning Exploration
Qisheng Wang, Qichao Wang
2019-07-18
Improved Reinforcement Learning through Imitation Learning Pretraining Towards Image-based Autonomous Driving
Tianqi Wang, Dong Eui Chang
2019-07-16
Shapley Q-value: A Local Reward Approach to Solve Global Reward Games
Jianhong Wang, Yuan Zhang, Tae-Kyun Kim, Yunjie Gu
2019-07-11
Dependency-aware Attention Control for Unconstrained Face Recognition with Image Sets
Xiaofeng Liu, B. V. K Vijaya Kumar, Chao Yang, Qingming Tang, Jane You
2019-07-05
Modified Actor-Critics
Erinc Merdivan, Sten Hanke, Matthieu Geist
2019-07-02
Variational Quantum Circuits for Deep Reinforcement Learning
Samuel Yen-Chi Chen, Chao-Han Huck Yang, Jun Qi, Pin-Yu Chen, Xiaoli Ma, Hsi-Sheng Goan
2019-06-30
Optimal Use of Experience in First Person Shooter Environments
Matthew Aitchison
2019-06-24
Proximal Distilled Evolutionary Reinforcement Learning
Cristian Bodnar, Ben Day, Pietro Lió
2019-06-24
Exploring Model-based Planning with Policy Networks
Tingwu Wang, Jimmy Ba
2019-06-20
Experience Replay Optimization
Daochen Zha, Kwei-Herng Lai, Kaixiong Zhou, Xia Hu
2019-06-19
Evolutionary Reinforcement Learning for Sample-Efficient Multiagent Coordination
Shauharda Khadka, Somdeb Majumdar, Santiago Miret, Stephen McAleer, Kagan Tumer
2019-06-18
Goal-conditioned Imitation Learning
Yiming Ding, Carlos Florensa, Mariano Phielipp, Pieter Abbeel
2019-06-13
Deep Reinforcement Learning for Unmanned Aerial Vehicle-Assisted Vehicular Networks
Ming Zhu, Xiao-Yang Liu, Xiaodong Wang
2019-06-12
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past
Che Wang, Keith Ross
2019-06-10
Exploration via Hindsight Goal Generation
Zhizhou Ren, Kefan Dong, Yuan Zhou, Qiang Liu, Jian Peng
2019-06-10
Improving Exploration in Soft-Actor-Critic with Normalizing Flows Policies
Patrick Nadeem Ward, Ariella Smofsky, Avishek Joey Bose
2019-06-06
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning
Wendelin Böhmer, Tabish Rashid, Shimon Whiteson
2019-06-05
Episodic Memory in Lifelong Language Learning
Cyprien de Masson d'Autume, Sebastian Ruder, Lingpeng Kong, Dani Yogatama
2019-06-03
Harnessing Reinforcement Learning for Neural Motion Planning
Tom Jurgenson, Aviv Tamar
2019-06-01
Prioritized Sequence Experience Replay
Marc Brittain, Josh Bertram, Xuxi Yang, Peng Wei
2019-05-25
Maximum Entropy-Regularized Multi-Goal Reinforcement Learning
Rui Zhao, Xudong Sun, Volker Tresp
2019-05-21
Combining Experience Replay with Exploration by Random Network Distillation
Francesco Sovrano
2019-05-18
Bias-Reduced Hindsight Experience Replay with Virtual Goal Prioritization
Binyamin Manela, Armin Biess
2019-05-14
Deep Residual Reinforcement Learning
Shangtong Zhang, Wendelin Boehmer, Shimon Whiteson
2019-05-03
Collaborative Evolutionary Reinforcement Learning
Shauharda Khadka, Somdeb Majumdar, Tarek Nassar, Zach Dwiel, Evren Tumer, Santiago Miret, Yinyin Liu, Kagan Tumer
2019-05-02
DHER: Hindsight Experience Replay for Dynamic Goals
Meng Fang, Cheng Zhou, Bei Shi, Boqing Gong, Jia Xu, Tong Zhang
2019-05-01
ACTRCE: Augmenting Experience via Teacher’s Advice
Yuhuai Wu, Harris Chan, Jamie Kiros, Sanja Fidler, Jimmy Ba
2019-05-01
Learning Goal-Conditioned Value Functions with one-step Path rewards rather than Goal-Rewards
Vikas Dhiman, Shurjo Banerjee, Jeffrey M Siskind, Jason J Corso
2019-05-01
Learning agents with prioritization and parameter noise in continuous state and action space
Rajesh Devaraddi, G. Srinivasaraghavan
2019-05-01
CEM-RL: Combining evolutionary and gradient-based methods for policy search
Pourchot, Sigaud
2019-05-01
Towards Combining On-Off-Policy Methods for Real-World Applications
Kai-Chun Hu, Chen-Huan Pi, Ting Han Wei, I-Chen Wu, Stone Cheng, Yi-Wei Dai, Wei-Yuan Ye
2019-04-24
Personalized Cancer Chemotherapy Schedule: a numerical comparison of performance and robustness in model-based and model-free scheduling methodologies
Jesus Tordesillas, Juncal Arbelaiz
2019-04-02
Deep Reinforcement Learning with Feedback-based Exploration
Jan Scholten, Daan Wout, Carlos Celemin, Jens Kober
2019-03-14
Complementary Learning for Overcoming Catastrophic Forgetting Using Experience Replay
Mohammad Rostami, Soheil Kolouri, Praveen K. Pilly
2019-03-11
Asynchronous Episodic Deep Deterministic Policy Gradient: Towards Continuous Control in Computationally Complex Environments
Zhizheng Zhang, Jiale Chen, Zhibo Chen, Weiping Li
2019-03-03
Deep Reinforcement Learning using Genetic Algorithm for Parameter Optimization
Adarsh Sehgal, Hung Manh La, Sushil J. Louis, Hai Nguyen
2019-02-19
CrossNorm: Normalization for Off-Policy TD Reinforcement Learning
Aditya Bhatt, Max Argus, Artemij Amiranashvili, Thomas Brox
2019-02-14
ACTRCE: Augmenting Experience via Teacher's Advice For Multi-Goal Reinforcement Learning
Harris Chan, Yuhuai Wu, Jamie Kiros, Sanja Fidler, Jimmy Ba
2019-02-12
Competitive Experience Replay
Hao Liu, Alexander Trott, Richard Socher, Caiming Xiong
2019-02-01
Addressing Sample Complexity in Visual Tasks Using HER and Hallucinatory GANs
Himanshu Sahni, Toby Buckley, Pieter Abbeel, Ilya Kuzovkin
2019-01-31
Reward Shaping via Meta-Learning
Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu
2019-01-27
Power Allocation in Multi-User Cellular Networks: Deep Reinforcement Learning Approaches
Fan Meng, Peng Chen, Lenan Wu, Julian Cheng
2019-01-22
On-Policy Trust Region Policy Optimisation with Replay Buffers
Dmitry Kangin, Nicolas Pugeault
2019-01-18
Transfer Learning for Prosthetics Using Imitation Learning
Montaser Mohammedalamen, Waleed D. Khamies, Benjamin Rosman
2019-01-15
A Theoretical Analysis of Deep Q-Learning
Jianqing Fan, Zhaoran Wang, Yuchen Xie, Zhuoran Yang
2019-01-01
Double Deep Q-Learning for Optimal Execution
Brian Ning, Franco Ho Ting Lin, Sebastian Jaimungal
2018-12-17
Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach
Zhao Chen, Xiaodong Wang
2018-12-16
Soft Actor-Critic Algorithms and Applications
Tuomas Haarnoja, Aurick Zhou, Kristian Hartikainen, George Tucker, Sehoon Ha, Jie Tan, Vikash Kumar, Henry Zhu, Abhishek Gupta, Pieter Abbeel, Sergey Levine
2018-12-13
Off-Policy Deep Reinforcement Learning without Exploration
Scott Fujimoto, David Meger, Doina Precup
2018-12-07
Deep Reinforcement Learning and the Deadly Triad
Hado van Hasselt, Yotam Doron, Florian Strub, Matteo Hessel, Nicolas Sonnerat, Joseph Modayil
2018-12-06
Resource Constrained Deep Reinforcement Learning
Abhinav Bhatia, Pradeep Varakantham, Akshat Kumar
2018-12-03
Deep Multi-Agent Reinforcement Learning with Relevance Graphs
Aleksandra Malysheva, Tegg Taekyong Sung, Chae-Bong Sohn, Daniel Kudenko, Aleksei Shpilman
2018-11-30
Experience Replay for Continual Learning
David Rolnick, Arun Ahuja, Jonathan Schwarz, Timothy P. Lillicrap, Greg Wayne
2018-11-28
Deep Reinforcement Learning for Autonomous Driving
Sen Wang, Daoyuan Jia, Xinshuo Weng
2018-11-28
Learning with Stochastic Guidance for Navigation
Linhai Xie, Yishu Miao, Sen Wang, Phil Blunsom, Zhihua Wang, Changhao Chen, Andrew Markham, Niki Trigoni
2018-11-27
Intelligent Inverse Treatment Planning via Deep Reinforcement Learning, a Proof-of-Principle Study in High Dose-rate Brachytherapy for Cervical Cancer
Chenyang Shen, Yesenia Gonzalez, Peter Klages, Nan Qin, Hyunuk Jung, Liyuan Chen, Dan Nguyen, Steve B. Jiang, Xun Jia
2018-11-25
Modelling the Dynamic Joint Policy of Teammates with Attention Multi-agent DDPG
Hangyu Mao, Zhengchao Zhang, Zhen Xiao, Zhibo Gong
2018-11-13
ACE: An Actor Ensemble Algorithm for Continuous Control with Tree Search
Shangtong Zhang, Hao Chen, Hengshuai Yao
2018-11-06
Learning to Learn without Forgetting by Maximizing Transfer and Minimizing Interference
Matthew Riemer, Ignacio Cases, Robert Ajemian, Miao Liu, Irina Rish, Yuhai Tu, Gerald Tesauro
2018-10-29
Reconciling $λ$-Returns with Experience Replay
Brett Daley, Christopher Amato
2018-10-23
Hierarchical Approaches for Reinforcement Learning in Parameterized Action Space
Ermo Wei, Drew Wicke, Sean Luke
2018-10-23
Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
Jiechao Xiong, Qing Wang, Zhuoran Yang, Peng Sun, Lei Han, Yang Zheng, Haobo Fu, Tong Zhang, Ji Liu, Han Liu
2018-10-10
Energy-Based Hindsight Experience Prioritization
Rui Zhao, Volker Tresp
2018-10-02
Hierarchical Deep Multiagent Reinforcement Learning with Temporal Abstraction
Hongyao Tang, Jianye Hao, Tangjie Lv, Yingfeng Chen, Zongzhang Zhang, Hangtian Jia, Chunxu Ren, Yan Zheng, Zhaopeng Meng, Changjie Fan, Li Wang
2018-09-25
Dynamic Weights in Multi-Objective Deep Reinforcement Learning
Axel Abels, Diederik M. Roijers, Tom Lenaerts, Ann Nowé, Denis Steckelmacher
2018-09-20
Generalizing Across Multi-Objective Reward Functions in Deep Reinforcement Learning
Eli Friedman, Fred Fontaine
2018-09-17
Curriculum goal masking for continuous deep reinforcement learning
Manfred Eppe, Sven Magg, Stefan Wermter
2018-09-17
Improvements on Hindsight Learning
Ameet Deshpande, Srikanth Sarma, Ashutosh Jha, Balaraman Ravindran
2018-09-16
Deep Learning with Experience Ranking Convolutional Neural Network for Robot Manipulator
Hai Nguyen, Hung Manh La, Matthew Deans
2018-09-16
Learning Adaptive Display Exposure for Real-Time Advertising
Weixun Wang, Junqi Jin, Jianye Hao, Chunjie Chen, Chuan Yu, Weinan Zhang, Jun Wang, Xiaotian Hao, Yixi Wang, Han Li, Jian Xu, Kun Gai
2018-09-10
ARCHER: Aggressive Rewards to Counter bias in Hindsight Experience Replay
Sameera Lanka, Tianfu Wu
2018-09-06
Adversarial Deep Reinforcement Learning in Portfolio Management
Zhipeng Liang, Hao Chen, Junhao Zhu, Kangkang Jiang, Yanran Li
2018-08-29
Goal-oriented Dialogue Policy Learning from Failures
Keting Lu, Shiqi Zhang, Xiaoping Chen
2018-08-20
Remember and Forget for Experience Replay
Guido Novati, Petros Koumoutsakos
2018-07-16
Bipedal Walking Robot using Deep Deterministic Policy Gradient
Arun Kumar, Navneet Paul, S N Omkar
2018-07-16
Deterministic Policy Gradients With General State Transitions
Qingpeng Cai, Ling Pan, Pingzhong Tang
2018-07-10
Learning to Explore via Meta-Policy Gradient
Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng
2018-07-01
Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains
Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White
2018-06-12
Randomized Value Functions via Multiplicative Normalizing Flows
Ahmed Touati, Harsh Satija, Joshua Romoff, Joelle Pineau, Pascal Vincent
2018-06-06
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Su Young Lee, Sungik Choi, Sae-Young Chung
2018-05-31
Deep Reinforcement Learning in a Handful of Trials using Probabilistic Dynamics Models
Kurtland Chua, Roberto Calandra, Rowan McAllister, Sergey Levine
2018-05-30
Advances in Experience Replay
Tracy Wan, Neil Xu
2018-05-15
Do deep reinforcement learning agents model intentions?
Tambet Matiisen, Aqeel Labash, Daniel Majoral, Jaan Aru, Raul Vicente
2018-05-15
Metatrace Actor-Critic: Online Step-size Tuning by Meta-gradient Descent for Reinforcement Learning Control
Kenny Young, Baoxiang Wang, Matthew E. Taylor
2018-05-10
Multiagent Soft Q-Learning
Ermo Wei, Drew Wicke, David Freelan, Sean Luke
2018-04-25
State Distribution-aware Sampling for Deep Q-learning
Weichao Li, Fuxian Huang, Xi Li, Gang Pan, Fei Wu
2018-04-23
Learning to Explore with Meta-Policy Gradient
Tianbing Xu, Qiang Liu, Liang Zhao, Jian Peng
2018-03-13
Distributed Prioritized Experience Replay
Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver
2018-03-02
Addressing Function Approximation Error in Actor-Critic Methods
Scott Fujimoto, Herke van Hoof, David Meger
2018-02-26
Multi-Goal Reinforcement Learning: Challenging Robotics Environments and Request for Research
Matthias Plappert, Marcin Andrychowicz, Alex Ray, Bob McGrew, Bowen Baker, Glenn Powell, Jonas Schneider, Josh Tobin, Maciek Chociej, Peter Welinder, Vikash Kumar, Wojciech Zaremba
2018-02-26
Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments
Yan Zheng, Jianye Hao, Zongzhang Zhang
2018-02-23
Continual Reinforcement Learning with Complex Synapses
Christos Kaplanis, Murray Shanahan, Claudia Clopath
2018-02-20
A Deep Q-Learning Agent for the L-Game with Variable Batch Training
Petros Giannakopoulos, Yannis Cotronis
2018-02-17
GEP-PG: Decoupling Exploration and Exploitation in Deep Reinforcement Learning Algorithms
Cédric Colas, Olivier Sigaud, Pierre-Yves Oudeyer
2018-02-14
Efficient Exploration through Bayesian Deep Q-Networks
Kamyar Azizzadenesheli, Animashree Anandkumar
2018-02-13
Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces
Gellért Weisz, Paweł Budzianowski, Pei-Hao Su, Milica Gašić
2018-02-11
IMPALA: Scalable Distributed Deep-RL with Importance Weighted Actor-Learner Architectures
Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu
2018-02-05
Pretraining Deep Actor-Critic Reinforcement Learning Algorithms With Expert Demonstrations
Xiaoqin Zhang, Huimin Ma
2018-01-31
Deep In-GPU Experience Replay
Ben Parr
2018-01-09
Faster Deep Q-learning using Neural Episodic Control
Daichi Nishio, Satoshi Yamane
2018-01-06
Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor
Tuomas Haarnoja, Aurick Zhou, Pieter Abbeel, Sergey Levine
2018-01-04
ScreenerNet: Learning Self-Paced Curriculum for Deep Neural Networks
Tae-Hoon Kim, Jonghyun Choi
2018-01-03
ViZDoom: DRQN with Prioritized Experience Replay, Double-Q Learning, & Snapshot Ensembling
Christopher Schulze, Marcus Schulze
2018-01-03
A Deeper Look at Experience Replay
Shangtong Zhang, Richard S. Sutton
2017-12-04
AMBER: Adaptive Multi-Batch Experience Replay for Continuous Action Control
Seungyul Han, Youngchul Sung
2017-10-12
A novel DDPG method with prioritized experience replay
Yuenan Hou, Lifeng Liu, Qing Wei, Xudong Xu, Chunlin Chen
2017-10-01
Overcoming Exploration in Reinforcement Learning with Demonstrations
Ashvin Nair, Bob McGrew, Marcin Andrychowicz, Wojciech Zaremba, Pieter Abbeel
2017-09-28
Linear Stochastic Approximation: Constant Step-Size and Iterate Averaging
Chandrashekar Lakshminarayanan, Csaba Szepesvári
2017-09-12
Improving Stochastic Policy Gradients in Continuous Control with Deep Reinforcement Learning using the Beta Distribution
Po-Wei Chou, Daniel Maturana, Sebastian Scherer
2017-08-01
Leveraging Demonstrations for Deep Reinforcement Learning on Robotics Problems with Sparse Rewards
Mel Vecerik, Todd Hester, Jonathan Scholz, Fumin Wang, Olivier Pietquin, Bilal Piot, Nicolas Heess, Thomas Rothörl, Thomas Lampe, Martin Riedmiller
2017-07-27
Lenient Multi-Agent Deep Reinforcement Learning
Gregory Palmer, Karl Tuyls, Daan Bloembergen, Rahul Savani
2017-07-14
The Intentional Unintentional Agent: Learning to Solve Many Continuous Control Tasks Simultaneously
Serkan Cabi, Sergio Gómez Colmenarejo, Matthew W. Hoffman, Misha Denil, Ziyu Wang, Nando de Freitas
2017-07-11
Hindsight Experience Replay
Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba
2017-07-05
Sample-efficient Actor-Critic Reinforcement Learning with Supervised Data for Dialogue Management
Pei-Hao Su, Pawel Budzianowski, Stefan Ultes, Milica Gasic, Steve Young
2017-07-01
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch
2017-06-07
Parameter Space Noise for Exploration
Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
2017-06-06
Discrete Sequential Prediction of Continuous Actions for Deep RL
Luke Metz, Julian Ibarz, Navdeep Jaitly, James Davidson
2017-05-14
Adaptive Traffic Signal Control: Deep Reinforcement Learning Algorithm with Experience Replay and Target Network
Juntao Gao, Yulong Shen, Jia Liu, Minoru Ito, Norio Shiratori
2017-05-08
Vision-based Robotic Arm Imitation by Human Gesture
Cheng Xuan, Zhiqiang Tang, Jinxin Xu
2017-03-15
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson
2017-02-28
Sample-efficient Deep Reinforcement Learning for Dialog Control
Kavosh Asadi, Jason D. Williams
2016-12-18
Sample Efficient Actor-Critic with Experience Replay
Ziyu Wang, Victor Bapst, Nicolas Heess, Volodymyr Mnih, Remi Munos, Koray Kavukcuoglu, Nando de Freitas
2016-11-03
Online Contrastive Divergence with Generative Replay: Experience Replay without Storing Data
Decebal Constantin Mocanu, Maria Torres Vega, Eric Eaton, Peter Stone, Antonio Liotta
2016-10-18
Actor-critic versus direct policy search: a comparison based on sample complexity
Arnaud de Froissard de Broissia, Olivier Sigaud
2016-06-29
Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
2016-03-02
Prioritized Experience Replay
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
2015-11-18
Deep Reinforcement Learning with Double Q-learning
Hado van Hasselt, Arthur Guez, David Silver
2015-09-22
Continuous control with deep reinforcement learning
Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
2015-09-09
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
2013-12-19

Categories