Q-Learning

Q-Learning is an off-policy temporal difference control algorithm:

$$Q\left(S_{t}, A_{t}\right) \leftarrow Q\left(S_{t}, A_{t}\right) + \alpha\left[R_{t+1} + \gamma\max_{a}Q\left(S_{t+1}, a\right) - Q\left(S_{t}, A_{t}\right)\right] $$

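Here $\alpha$ is the step size, $\gamma$ is the discount factor, and the maximum is taken over the actions available in $S_{t+1}$. The learned action-value function $Q$ directly approximates $q_{*}$, the optimal action-value function, independent of the policy being followed.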
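
The update rule above can be sketched in a few lines of tabular Python. This is a minimal illustration, not code from Sutton and Barto: the chain environment, the function names `q_learning_update` and `train_chain`, and the values chosen for `alpha`, `gamma` and `epsilon` are all assumptions made for the example. The behaviour policy is epsilon-greedy, while the target always uses the greedy value of the next state, which is what makes the method off-policy.

```python
import random


def q_learning_update(Q, s, a, r, s_next, alpha, gamma, terminal=False):
    """Apply one Q-learning update to the table Q in place and return the TD error.

    Q is a list of per-state lists of action values (an assumed layout).
    """
    # The target bootstraps from the greedy value of the next state,
    # whatever action the behaviour policy actually takes there.
    best_next = 0.0 if terminal else max(Q[s_next])
    td_error = r + gamma * best_next - Q[s][a]
    Q[s][a] += alpha * td_error
    return td_error


def train_chain(n_states=5, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.2, seed=0):
    """Run Q-learning on a hypothetical deterministic chain.

    States are 0..n_states-1; action 0 moves left, action 1 moves right.
    Entering the last state yields reward 1 and ends the episode.
    """
    rng = random.Random(seed)
    Q = [[0.0, 0.0] for _ in range(n_states)]
    goal = n_states - 1
    for _ in range(episodes):
        s = 0
        while s != goal:
            # Epsilon-greedy behaviour policy (ties are broken towards "right").
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                a = 0 if Q[s][0] > Q[s][1] else 1
            s_next = max(s - 1, 0) if a == 0 else s + 1
            r = 1.0 if s_next == goal else 0.0
            q_learning_update(Q, s, a, r, s_next, alpha, gamma,
                              terminal=(s_next == goal))
            s = s_next
    return Q


if __name__ == "__main__":
    for state, values in enumerate(train_chain()):
        print(state, [round(v, 3) for v in values])
```

Under these assumptions the optimal value of moving right from state $k$ is $\gamma^{3-k}$ for the default five-state chain, so the learned entry for state 0 should approach $0.9^{3} = 0.729$ even though the agent sometimes explores by moving left.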

Source: Sutton and Barto, Reinforcement Learning, 2nd Edition

Latest Papers

PAPER DATE
A New Approach for Tactical Decision Making in Lane Changing: Sample Efficient Deep Q Learning with a Safety Feedback Reward
M. Ugur Yavas, N. Kemal Ure, Tufan Kumbasar
2020-09-24
Is Q-Learning Provably Efficient? An Extended Analysis
Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar
2020-09-22
Hidden Incentives for Auto-Induced Distributional Shift
David Krueger, Tegan Maharaj, Jan Leike
2020-09-19
Reinforcement Learning for Dynamic Resource Optimization in 5G Radio Access Network Slicing
Yi Shi, Yalin E. Sagduyu, Tugba Erpek
2020-09-14
Tactical Decision Making for Emergency Vehicles based on a Combinational Learning Method
Haoyi Niu, Jianming Hu
2020-09-09
A Hybrid PAC Reinforcement Learning Algorithm
Ashkan Zehfroosh, Herbert G. Tanner
2020-09-05
PAC Reinforcement Learning Algorithm for General-Sum Markov Games
Ashkan Zehfroosh, Herbert G. Tanner
2020-09-05
Solving the single-track train scheduling problem via Deep Reinforcement Learning
Valerio Agasucci, Giorgio Grani, Leonardo Lamorgese
2020-09-01
Theory of Deep Q-Learning: A Dynamical Systems Perspective
Arunselvan Ramaswamy
2020-08-25
Table2Charts: Learning Shared Representations for Recommending Charts on Multi-dimensional Data
Mengyu Zhou, Qingtao Li, Yuejiang Li, Shi Han, Dongmei Zhang
2020-08-24
The reinforcement learning-based multi-agent cooperative approach for the adaptive speed regulation on a metallurgical pickling line
Anna Bogomolova, Kseniia Kingsep, Boris Voskresenskii
2020-08-16
An adaptive synchronization approach for weights of deep reinforcement learning
S. Amirreza Badran, Mansoor Rezghi
2020-08-16
Reinforcement Learning with Quantum Variational Circuits
Owen Lockwood, Mei Si
2020-08-15
Chrome Dino Run using Reinforcement Learning
Divyanshu Marwah, Sneha Srivastava, Anusha Gupta, Shruti Verma
2020-08-15
Decision-making at Unsignalized Intersection for Autonomous Vehicles: Left-turn Maneuver with Deep Reinforcement Learning
Teng Liu, Xingyu Mu, Bing Huang, Xiaolin Tang, Fuqing Zhao, Xiao Wang, Dongpu Cao
2020-08-14
Caching Placement and Resource Allocation for Cache-Enabling UAV NOMA Networks
Tiankui Zhang, Ziduan Wang, Yuanwei Liu, Wenjun Xu, Arumugam Nallanathan
2020-08-12
Convex Q-Learning, Part 1: Deterministic Optimal Control
Prashant G. Mehta, Sean P. Meyn
2020-08-08
Deep Q-Network Based Multi-agent Reinforcement Learning with Binary Action Agents
Abdul Mueed Hafiz, Ghulam Mohiuddin Bhat
2020-08-06
Deep Inverse Q-learning with Constraints
Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker
2020-08-04
Cooperative Control of Mobile Robots with Stackelberg Learning
Joewie J. Koh, Guohui Ding, Christoffer Heckman, Lijun Chen, Alessandro Roncone
2020-08-03
QPLEX: Duplex Dueling Multi-Agent Q-Learning
Jianhao Wang, Zhizhou Ren, Terry Liu, Yang Yu, Chongjie Zhang
2020-08-03
Using Reinforcement Learning to Perform Qubit Routing in Quantum Compilers
Matteo G. Pozzi, Steven J. Herbert, Akash Sengupta, Robert D. Mullins
2020-07-31
Momentum Q-learning with Finite-Sample Convergence Guarantee
Bowen Weng, Huaqing Xiong, Lin Zhao, Yingbin Liang, Wei Zhang
2020-07-30
Variance Reduction for Deep Q-Learning using Stochastic Recursive Gradient
Haonan Jia, Xiao Zhang, Jun Xu, Wei Zeng, Hao Jiang, Xiaohui Yan, Ji-Rong Wen
2020-07-25
A Comparative Study of AI-based Intrusion Detection Techniques in Critical Infrastructures
Safa Otoum, Burak Kantarci, Hussein Mouftah
2020-07-24
UAV Target Tracking in Urban Environments Using Deep Reinforcement Learning
Sarthak Bhagat, Sujit PB
2020-07-21
Deep vs. Deep Bayesian: Reinforcement Learning on a Multi-Robot Competitive Experiment
Jingyi Huang, Andre Rosendo
2020-07-21
EMaQ: Expected-Max Q-Learning Operator for Simple Yet Effective Offline and Online RL
Seyed Kamyar Seyed Ghasemipour, Dale Schuurmans, Shixiang Shane Gu
2020-07-21
A Machine Learning Approach for Task and Resource Allocation in Mobile Edge Computing Based Networks
Sihua Wang, Mingzhe Chen, Xuanlin Liu, Changchuan Yin, Shuguang Cui, H. Vincent Poor
2020-07-20
Multi-agent Reinforcement Learning in Bayesian Stackelberg Markov Games for Adaptive Moving Target Defense
Sailik Sengupta, Subbarao Kambhampati
2020-07-20
PC-PG: Policy Cover Directed Exploration for Provable Policy Gradient Learning
Alekh Agarwal, Mikael Henaff, Sham Kakade, Wen Sun
2020-07-16
Meta-Gradient Reinforcement Learning with an Objective Discovered Online
Zhongwen Xu, Hado van Hasselt, Matteo Hessel, Junhyuk Oh, Satinder Singh, David Silver
2020-07-16
Mixture of Step Returns in Bootstrapped DQN
Po-Han Chiang, Hsuan-Kung Yang, Zhang-Wei Hong, Chun-Yi Lee
2020-07-16
DRIFT: Deep Reinforcement Learning for Functional Software Testing
Luke Harries, Rebekah Storan Clarke, Timothy Chapman, Swamy V. P. L. N. Nallamalli, Levent Ozgur, Shuktika Jain, Alex Leung, Steve Lim, Aaron Dietrich, José Miguel Hernández-Lobato, Tom Ellis, Cheng Zhang, Kamil Ciosek
2020-07-16
Reinforcement Learning-Enabled Decision-Making Strategies for a Vehicle-Cyber-Physical-System in Connected Environment
Teng Liu, Xiaolin Tang, Jinwei Zhang, Wenbo Li, Zejian Deng, Yalian Yang
2020-07-16
Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning
Sabrina Hoppe, Marc Toussaint
2020-07-15
Analysis of Q-learning with Adaptation and Momentum Restart for Gradient Descent
Bowen Weng, Huaqing Xiong, Yingbin Liang, Wei Zhang
2020-07-15
Single-partition adaptive Q-learning
João Pedro Araújo, Mário Figueiredo, Miguel Ayala Botto
2020-07-14
Revisiting Fundamentals of Experience Replay
William Fedus, Prajit Ramachandran, Rishabh Agarwal, Yoshua Bengio, Hugo Larochelle, Mark Rowland, Will Dabney
2020-07-13
Simulating multi-exit evacuation using deep reinforcement learning
Dong Xu, Xiao Huang, Joseph Mango, Xiang Li, Zhenlong Li
2020-07-11
SUNRISE: A Simple Unified Framework for Ensemble Learning in Deep Reinforcement Learning
Kimin Lee, Michael Laskin, Aravind Srinivas, Pieter Abbeel
2020-07-09
Provably-Efficient Double Q-Learning
Wentao Weng, Harsh Gupta, Niao He, Lei Ying, R. Srikant
2020-07-09
Auto-MAP: A DQN Framework for Exploring Distributed Execution Plans for DNN Workloads
Siyu Wang, Yi Rong, Shiqing Fan, Zhen Zheng, LanSong Diao, Guoping Long, Jun Yang, Xiaoyong Liu, Wei Lin
2020-07-08
Cognitive Radio Network Throughput Maximization with Deep Reinforcement Learning
Kevin Shen Hoong Ong, Yang Zhang, Dusit Niyato
2020-07-07
Neural Interactive Collaborative Filtering
Lixin Zou, Long Xia, Yulong Gu, Xiangyu Zhao, Weidong Liu, Jimmy Xiangji Huang, Dawei Yin
2020-07-04
Reward Machines for Cooperative Multi-Agent Reinforcement Learning
Cyrus Neary, Zhe Xu, Bo Wu, Ufuk Topcu
2020-07-03
Decentralized Deep Reinforcement Learning for Network Level Traffic Signal Control
Jin Guo
2020-07-02
Gradient Temporal-Difference Learning with Regularized Corrections
Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White
2020-07-01
Regularly Updated Deterministic Policy Gradient Algorithm
Shuai Han, Wenbo Zhou, Shuai Lü, Jiayu Yu
2020-07-01
Group Equivariant Deep Reinforcement Learning
Arnab Kumar Mondal, Pratheeksha Nair, Kaleem Siddiqi
2020-07-01
Provably More Efficient Q-Learning in the Full-Feedback/One-Sided-Feedback Settings
Xiao-Yue Gong, David Simchi-Levi
2020-06-30
Concept and the implementation of a tool to convert industry 4.0 environments modeled as FSM to an OpenAI Gym wrapper
Kallil M. C. Zielinski, Marcelo Teixeira, Richardson Ribeiro, Dalcimar Casanova
2020-06-29
Using Reinforcement Learning to Herd a Robotic Swarm to a Target Distribution
Zahi M. Kakish, Karthik Elamvazhuthi, Spring Berman
2020-06-29
Lookahead-Bounded Q-Learning
Ibrahim El Shar, Daniel R. Jiang
2020-06-28
Reinforcement Learning Based Handwritten Digit Recognition with Two-State Q-Learning
Abdul Mueed Hafiz, Ghulam Mohiuddin Bhat
2020-06-28
Image Classification by Reinforcement Learning with Two-State Q-Learning
Abdul Mueed Hafiz, Ghulam Mohiuddin Bhat
2020-06-28
Overfitting and Optimization in Offline Policy Learning
David Brandfonbrener, William F. Whitney, Rajesh Ranganath, Joan Bruna
2020-06-27
Q-Learning with Differential Entropy of Q-Tables
Tung D. Nguyen, Kathryn E. Kasmarik, Hussein A. Abbass
2020-06-26
Noise, overestimation and exploration in Deep Reinforcement Learning
Rafael Stekolshchik
2020-06-25
Reducing Overestimation Bias by Increasing Representation Dissimilarity in Ensemble Based Deep Q-Learning
Hassam Ullah Sheikh, Ladislau Bölöni
2020-06-24
RL Unplugged: Benchmarks for Offline Reinforcement Learning
Caglar Gulcehre, Ziyu Wang, Alexander Novikov, Tom Le Paine, Sergio Gomez Colmenarejo, Konrad Zolna, Rishabh Agarwal, Josh Merel, Daniel Mankowitz, Cosmin Paduraru, Gabriel Dulac-Arnold, Jerry Li, Mohammad Norouzi, Matt Hoffman, Ofir Nachum, George Tucker, Nicolas Heess, Nando de Freitas
2020-06-24
Deep Reinforcement Learning Control for Radar Detection and Tracking in Congested Spectral Environments
Charles E. Thornton, Mark A. Kozy, R. Michael Buehrer, Anthony F. Martone, Kelly D. Sherbondy
2020-06-23
Risk-Sensitive Reinforcement Learning: Near-Optimal Risk-Sample Tradeoff in Regret
Yingjie Fei, Zhuoran Yang, Yudong Chen, Zhaoran Wang, Qiaomin Xie
2020-06-22
NROWAN-DQN: A Stable Noisy Network with Noise Reduction and Online Weight Adjustment for Exploration
Shuai Han, Wenbo Zhou, Jing Liu, Shuai Lü
2020-06-19
Efficient Ridesharing Dispatch Using Multi-Agent Reinforcement Learning
Oscar de Lima, Hansal Shah, Ting-Sheng Chu, Brian Fogelson
2020-06-18
Semantic Visual Navigation by Watching YouTube Videos
Matthew Chang, Arjun Gupta, Saurabh Gupta
2020-06-17
Parameterized MDPs and Reinforcement Learning Problems -- A Maximum Entropy Principle Based Framework
Amber Srivastava, Srinivasa M Salapaka
2020-06-17
The Teaching Dimension of Q-learning
Xuezhou Zhang, Shubham Kumar Bharti, Yuzhe Ma, Adish Singla, Xiaojin Zhu
2020-06-16
Interaction Networks: Using a Reinforcement Learner to train other Machine Learning algorithms
Florian Dietz
2020-06-15
Deep Reinforcement Learning for Neural Control
Jimin Kim, Eli Shlizerman
2020-06-12
Human and Multi-Agent collaboration in a human-MARL teaming framework
Neda Navidi, Francois Chabot, Sagar Kurandwad, Irv Lustigman, Vincent Robert, Gregory Szriftgiser, Andrea Schuch
2020-06-12
Decorrelated Double Q-learning
Gang Chen
2020-06-12
Safety-guaranteed Reinforcement Learning based on Multi-class Support Vector Machine
Kwangyeon Kim, Akshita Gupta, Hong-Cheol Choi, Inseok Hwang
2020-06-12
Self-Imitation Learning via Generalized Lower Bound Q-learning
Yunhao Tang
2020-06-12
Exploration by Maximizing Rényi Entropy for Zero-Shot Meta RL
Chuheng Zhang, Yuanying Cai, Longbo Huang, Jian Li
2020-06-11
Model-Free Algorithm and Regret Analysis for MDPs with Long-Term Constraints
Qinbo Bai, Vaneet Aggarwal, Ather Gattami
2020-06-10
Fitted Q-Learning for Relational Domains
Srijita Das, Sriraam Natarajan, Kaushik Roy, Ronald Parr, Kristian Kersting
2020-06-10
Reinforcement Learning-Based Joint Self-Optimisation Method for the Fuzzy Logic Handover Algorithm in 5G HetNets
Qianyu Liu, Chiew Foong Kwong, Sun Wei, Lincan Li, Jing Wang
2020-06-09
Conservative Q-Learning for Offline Reinforcement Learning
Aviral Kumar, Aurick Zhou, George Tucker, Sergey Levine
2020-06-08
Can Temporal-Difference and Q-Learning Learn Representation? A Mean-Field Theory
Yufeng Zhang, Qi Cai, Zhuoran Yang, Yongxin Chen, Zhaoran Wang
2020-06-08
A Model-free Learning Algorithm for Infinite-horizon Average-reward MDPs with Near-optimal Regret
Mehdi Jafarnia-Jahromi, Chen-Yu Wei, Rahul Jain, Haipeng Luo
2020-06-08
Balancing a CartPole System with Reinforcement Learning -- A Tutorial
Swagat Kumar
2020-06-08
A Multi-step and Resilient Predictive Q-learning Algorithm for IoT with Human Operators in the Loop: A Case Study in Water Supply Networks
Maria Grammatopoulou, Aris Kanellopoulos, Kyriakos G. Vamvoudakis, Nathan Lau
2020-06-06
Sample Complexity of Asynchronous Q-Learning: Sharper Analysis and Variance Reduction
Gen Li, Yuting Wei, Yuejie Chi, Yuantao Gu, Yuxin Chen
2020-06-04
A Novel Update Mechanism for Q-Networks Based On Extreme Learning Machines
Callum Wilson, Annalisa Riccardi, Edmondo Minisci
2020-06-04
Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning
Mei Wang, Weihong Deng
2020-06-01
Towards Understanding Linear Value Decomposition in Cooperative Multi-Agent Q-Learning
Jianhao Wang, Zhizhou Ren, Beining Han, Chongjie Zhang
2020-05-31
Active Measure Reinforcement Learning for Observation Cost Minimization
Colin Bellinger, Rory Coles, Mark Crowley, Isaac Tamblyn
2020-05-26
Should artificial agents ask for help in human-robot collaborative problem-solving?
Adrien Bennetot, Vicky Charisi, Natalia Díaz-Rodríguez
2020-05-25
Learning to Charge RF-Energy Harvesting Devices in WiFi Networks
Yizhou Luo, Kwan-Wu Chin
2020-05-25
A reinforcement learning based decision support system in textile manufacturing process
Zhenglei He, Kim Phuc Tran, Sébastien Thomassey, Xianyi Zeng, Changhai Yi
2020-05-20
Prototypical Q Networks for Automatic Conversational Diagnosis and Few-Shot New Disease Adaption
Hongyin Luo, Shang-Wen Li, James Glass
2020-05-19
Local and Global Explanations of Agent Behavior: Integrating Strategy Summaries with Saliency Maps
Tobias Huber, Katharina Weitz, Elisabeth André, Ofra Amir
2020-05-18
Basal Glucose Control in Type 1 Diabetes using Deep Reinforcement Learning: An In Silico Validation
Taiyu Zhu, Kezhi Li, Pau Herrero, Pantelis Georgiou
2020-05-18
A Deep Q-learning/genetic Algorithms Based Novel Methodology For Optimizing Covid-19 Pandemic Government Actions
Luis Miralles-Pechuán, Fernando Jiménez, Hiram Ponce, Lourdes Martínez-Villaseñor
2020-05-15
A Deep Reinforcement Learning Approach to Efficient Drone Mobility Support
Yun Chen, Xingqin Lin, Talha Ahmed Khan, Mohammad Mozaffari
2020-05-11
An FPGA-Based On-Device Reinforcement Learning Approach using Online Sequential Learning
Hirohisa Watanabe, Mineto Tsukada, Hiroki Matsutani
2020-05-10
Reinforcement Learning for Thermostatically Controlled Loads Control using Modelica and Python
Oleh Lukianykhin, Tetiana Bogodorova
2020-05-09
Optimal Beam Association for High Mobility mmWave Vehicular Networks: Lightweight Parallel Reinforcement Learning Approach
Nguyen Van Huynh, Diep N. Nguyen, Dinh Thai Hoang, Eryk Dutkiewicz
2020-05-02
Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
Rong Zhu, Sheng Yang, Andreas Pfadler, Zhengping Qian, Jingren Zhou
2020-05-01
Implementing Inductive bias for different navigation tasks through diverse RNN attractors
Tie XU, Omri Barak
2020-05-01
Whittle index based Q-learning for restless bandits with average reward
Konstantin Avrachenkov, Vivek S. Borkar
2020-04-29
Learning Dialog Policies from Weak Demonstrations
Gabriel Gordon-Hall, Philip John Gorinski, Shay B. Cohen
2020-04-23
Spatial Action Maps for Mobile Manipulation
Jimmy Wu, Xingyuan Sun, Andy Zeng, Shuran Song, Johnny Lee, Szymon Rusinkiewicz, Thomas Funkhouser
2020-04-20
Show Us the Way: Learning to Manage Dialog from Demonstrations
Gabriel Gordon-Hall, Philip John Gorinski, Gerasimos Lampouras, Ignacio Iacobacci
2020-04-17
Deep Reinforcement Learning for Adaptive Learning Systems
Xiao Li, Hanchen Xu, Jinming Zhang, Hua-hua Chang
2020-04-17
K-spin Hamiltonian for quantum-resolvable Markov decision processes
Eric B. Jones, Peter Graf, Eliot Kapit, Wesley Jones
2020-04-13
Risk-Aware High-level Decisions for Automated Driving at Occluded Intersections with Reinforcement Learning
Danial Kamran, Carlos Fernandez Lopez, Martin Lauer, Christoph Stiller
2020-04-09
An Application of Deep Reinforcement Learning to Algorithmic Trading
Thibaut Théate, Damien Ernst
2020-04-07
Zero-Shot Learning of Text Adventure Games with Sentence-Level Semantics
Xusen Yin, Jonathan May
2020-04-06
Uniform State Abstraction For Reinforcement Learning
John Burden, Daniel Kudenko
2020-04-06
Multi-agent Reinforcement Learning for Resource Allocation in IoT networks with Edge Computing
Xiaolan Liu, Jiadong Yu, Yue Gao
2020-04-05
Minimizing Age-of-Information for Fog Computing-supported Vehicular Networks with Deep Q-learning
Maohong Chen, Yong Xiao, Qiang Li, Kwang-cheng Chen
2020-04-04
Reinforcement Learning for Mixed-Integer Problems Based on MPC
Sebastien Gros, Mario Zanon
2020-04-03
Statistically Model Checking PCTL Specifications on Markov Decision Processes via Reinforcement Learning
Yu Wang, Nima Roohi, Matthew West, Mahesh Viswanathan, Geir E. Dullerud
2020-04-01
Enhanced Rolling Horizon Evolution Algorithm with Opponent Model Learning: Results for the Fighting Game AI Competition
Zhentao Tang, Yuanheng Zhu, Dongbin Zhao, Simon M. Lucas
2020-03-31
Robust Q-learning
Ashkan Ertefaie, James R. McKay, David Oslin, Robert L. Strawderman
2020-03-27
Using Deep Reinforcement Learning Methods for Autonomous Vessels in 2D Environments
Mohammad Etemad, Nader Zare, Mahtab Sarvmaili, Amilcar Soares, Bruno Brandoli Machado, Stan Matwin
2020-03-23
Importance of using appropriate baselines for evaluation of data-efficiency in deep reinforcement learning for Atari
Kacper Kielak
2020-03-23
FlapAI Bird: Training an Agent to Play Flappy Bird Using Reinforcement Learning Techniques
Tai Vu, Leon Tran
2020-03-21
Distributed Reinforcement Learning for Cooperative Multi-Robot Object Manipulation
Guohui Ding, Joewie J. Koh, Kelly Merckaert, Bram Vanderborght, Marco M. Nicotra, Christoffer Heckman, Alessandro Roncone, Lijun Chen
2020-03-21
Deep Reinforcement Learning with Weighted Q-Learning
Andrea Cini, Carlo D'Eramo, Jan Peters, Cesare Alippi
2020-03-20
Interpretable Multi Time-scale Constraints in Model-free Deep Reinforcement Learning for Autonomous Driving
Gabriel Kalweit, Maria Huegle, Moritz Werling, Joschka Boedecker
2020-03-20
Robust Deep Reinforcement Learning against Adversarial Perturbations on State Observations
Huan Zhang, Hongge Chen, Chaowei Xiao, Bo Li, Mingyan Liu, Duane Boning, Cho-Jui Hsieh
2020-03-19
Simultaneous Navigation and Radio Mapping for Cellular-Connected UAV with Deep Reinforcement Learning
Yong Zeng, Xiaoli Xu, Shi Jin, Rui Zhang
2020-03-17
DisCor: Corrective Feedback in Reinforcement Learning via Distribution Correction
Aviral Kumar, Abhishek Gupta, Sergey Levine
2020-03-16
Active Perception and Representation for Robotic Manipulation
Youssef Zaky, Gaurav Paruthi, Bryan Tripp, James Bergstra
2020-03-15
A General Framework for Learning Mean-Field Games
Xin Guo, Anran Hu, Renyuan Xu, Junzi Zhang
2020-03-13
Application of Deep Q-Network in Portfolio Management
Ziming Gao, Yuan Gao, Yi Hu, Zhengyong Jiang, Jionglong Su
2020-03-13
Model-Free Algorithm and Regret Analysis for MDPs with Peak Constraints
Qinbo Bai, Vaneet Aggarwal, Ather Gattami
2020-03-11
Behavior Planning For Connected Autonomous Vehicles Using Feedback Deep Reinforcement Learning
Songyang Han, Fei Miao
2020-03-09
Software-Level Accuracy Using Stochastic Computing With Charge-Trap-Flash Based Weight Matrix
Varun Bhatt, Shalini Shrivastava, Tanmay Chavan, Udayan Ganguly
2020-03-09
Dynamic Experience Replay
Jieliang Luo, Hui Li
2020-03-04
Self-Supervised Object-Level Deep Reinforcement Learning
William Agnew, Pedro Domingos
2020-03-03
Reinforcement Learning in FlipIt
Laura Greige, Peter Chin
2020-02-28
ConQUR: Mitigating Delusional Bias in Deep Q-learning
Andy Su, Jayden Ooi, Tyler Lu, Dale Schuurmans, Craig Boutilier
2020-02-27
Optimistic Exploration even with a Pessimistic Initialisation
Tabish Rashid, Bei Peng, Wendelin Böhmer, Shimon Whiteson
2020-02-26
G-Learner and GIRL: Goal Based Wealth Management with Reinforcement Learning
Matthew Dixon, Igor Halperin
2020-02-25
Q-learning with Uniformly Bounded Variance: Large Discounting is Not a Barrier to Fast Learning
Adithya M. Devraj, Sean P. Meyn
2020-02-24
Millimeter Wave Communications with an Intelligent Reflector: Performance Optimization and Distributional Reinforcement Learning
Qianqian Zhang, Walid Saad, Mehdi Bennis
2020-02-24
A Double Q-Learning Approach for Navigation of Aerial Vehicles with Connectivity Constraint
Behzad Khamidehi, Elvino S. Sousa
2020-02-24
Periodic Q-Learning
Donghwan Lee, Niao He
2020-02-23
Anypath Routing Protocol Design via Q-Learning for Underwater Sensor Networks
Yuan Zhou, Tao Cao, Wei Xiang
2020-02-22
Disentangling Controllable Object through Video Prediction Improves Visual Reinforcement Learning
Yuanyi Zhong, Alexander Schwing, Jian Peng
2020-02-21
Langevin DQN
Vikranth Dwaracherla, Benjamin Van Roy
2020-02-17
Maxmin Q-learning: Controlling the Estimation Bias of Q-learning
Qingfeng Lan, Yangchen Pan, Alona Fyshe, Martha White
2020-02-16
Reinforced active learning for image segmentation
Arantxa Casanova, Pedro O. Pinheiro, Negar Rostamzadeh, Christopher J. Pal
2020-02-16
A Multimodal Dialogue System for Conversational Image Editing
Tzu-Hsiang Lin, Trung Bui, Doo Soon Kim, Jean Oh
2020-02-16
Listwise Learning to Rank with Deep Q-Networks
Abhishek Sharma
2020-02-13
Fast Reinforcement Learning for Anti-jamming Communications
Pei-Gen Ye, Yuan-Gen Wang, Jin Li, Liang Xiao
2020-02-13
$γ$-Regret for Non-Episodic Reinforcement Learning
Shuang Liu, Hao Su
2020-02-12
Q-Learning Algorithm for Mean-Field Controls, with Convergence and Complexity Analysis
Haotian Gu, Xin Guo, Xiaoli Wei, Renyuan Xu
2020-02-10
Learning State Abstractions for Transfer in Continuous Control
Kavosh Asadi, David Abel, Michael L. Littman
2020-02-08
Safe Wasserstein Constrained Deep Q-Learning
Aaron Kandel, Scott J. Moura
2020-02-07
A Stochastic Game Framework for Efficient Energy Management in Microgrid Networks
Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar
2020-02-06
Deep RBF Value Functions for Continuous Control
Kavosh Asadi, Ronald E. Parr, George D. Konidaris, Michael L. Littman
2020-02-05
Interpretable End-to-end Urban Autonomous Driving with Latent Deep Reinforcement Learning
Jianyu Chen, Shengbo Eben Li, Masayoshi Tomizuka
2020-01-23
Q-Learning in enormous action spaces via amortized approximate maximization
Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih
2020-01-22
Discriminator Soft Actor Critic without Extrinsic Rewards
Daichi Nishio, Daiki Kuyoshi, Toi Tsuneda, Satoshi Yamane
2020-01-19
Model-based Multi-Agent Reinforcement Learning with Cooperative Prioritized Sweeping
Eugenio Bargiacchi, Timothy Verstraeten, Diederik M. Roijers, Ann Nowé
2020-01-15
Deep Interactive Reinforcement Learning for Path Following of Autonomous Underwater Vehicle
Qilei Zhang, Jinying Lin, Qixin Sha, Bo He, Guangliang Li
2020-01-10
A Probabilistic Simulator of Spatial Demand for Product Allocation
Porter Jenkins, Hua Wei, J. Stockton Jenkins, Zhenhui Li
2020-01-09
EEG-based Drowsiness Estimation for Driving Safety using Deep Q-Learning
Yurui Ming, Dongrui Wu, Yu-Kai Wang, Yuhui Shi, Chin-Teng Lin
2020-01-08
Experimental Analysis of Reinforcement Learning Techniques for Spectrum Sharing Radar
Charles E. Thornton, R. Michael Buehrer, Anthony F. Martone, Kelly D. Sherbondy
2020-01-06
Qgraph-bounded Q-learning: Stabilizing Model-Free Off-Policy Deep Reinforcement Learning
Anonymous
2020-01-01
Dynamically Balanced Value Estimates for Actor-Critic Methods
Anonymous
2020-01-01
Implementing Inductive bias for different navigation tasks through diverse RNN attractors
Anonymous
2020-01-01
Striving for Simplicity in Off-Policy Deep Reinforcement Learning
Anonymous
2020-01-01
Learning Efficient Parameter Server Synchronization Policies for Distributed SGD
Anonymous
2020-01-01
Long-term planning, short-term adjustments
Anonymous
2020-01-01
CAN ALTQ LEARN FASTER: EXPERIMENTS AND THEORY
Anonymous
2020-01-01
SVQN: Sequential Variational Soft Q-Learning Networks
Shiyu Huang, Hang Su, Jun Zhu, Ting Chen
2020-01-01
Do recent advancements in model-based deep reinforcement learning really improve data efficiency?
Anonymous
2020-01-01
Deep Randomized Least Squares Value Iteration
Guy Adam, Tom Zahavy, Oron Anschel, Nahum Shimkin
2020-01-01
Way Off-Policy Batch Deep Reinforcement Learning of Human Preferences in Dialog
Natasha Jaques, Asma Ghandeharioun, Judy Hanwen Shen, Craig Ferguson, Agata Lapedriza, Noah Jones, Shixiang Gu, Rosalind Picard
2020-01-01
Information Theoretic Model Predictive Q-Learning
Mohak Bhardwaj, Ankur Handa, Dieter Fox, Byron Boots
2019-12-31
Learning an Interpretable Traffic Signal Control Policy
James Ault, Josiah P. Hanna, Guni Sharon
2019-12-23
Hamilton-Jacobi-Bellman Equations for Q-Learning in Continuous Time
Jeongho Kim, Insoon Yang
2019-12-23
Exploiting the potential of deep reinforcement learning for classification tasks in high-dimensional and unstructured data
Johan S. Obando-Ceron, Victor Romero Cano, Walter Mayor Toro
2019-12-20
Soft Q-network
Jingbin Liu, Xinyang Gu, Shuai Liu, Dexiang Zhang
2019-12-20
Sepsis World Model: A MIMIC-based OpenAI Gym "World Model" Simulator for Sepsis Treatment
Amirhossein Kiani, Chris Wang, Angela Xu
2019-12-15
Provably Efficient Reinforcement Learning with Aggregated States
Shi Dong, Benjamin Van Roy, Zhengyuan Zhou
2019-12-13
High dimensional precision medicine from patient-derived xenografts
Naim U. Rashid, Daniel J. Luckett, Jingxiang Chen, Michael T. Lawson, Longshaokan Wang, Yunshu Zhang, Eric B. Laber, Yufeng Liu, Jen Jen Yeh, Donglin Zeng, Michael R. Kosorok
2019-12-13
A Finite-Time Analysis of Q-Learning with Neural Network Function Approximation
Pan Xu, Quanquan Gu
2019-12-10
Learning Sparse Representations Incrementally in Deep Reinforcement Learning
J. Fernando Hernandez-Garcia, Richard S. Sutton
2019-12-09
Value-of-Information based Arbitration between Model-based and Model-free Control
Krishn Bera, Yash Mandilwar, Bapi Raju
2019-12-08
Reinforcement Learning with Non-Markovian Rewards
Maor Gaon, Ronen I. Brafman
2019-12-05
Combining Q-Learning and Search with Amortized Value Estimates
Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Tobias Pfaff, Theophane Weber, Lars Buesing, Peter W. Battaglia
2019-12-05
Learning to Dynamically Coordinate Multi-Robot Teams in Graph Attention Networks
Zheyuan Wang, Matthew Gombolay
2019-12-04
A Unified Switching System Perspective and O.D.E. Analysis of Q-Learning Algorithms
Donghwan Lee, Niao He
2019-12-04
Neighborhood Cognition Consistent Multi-Agent Reinforcement Learning
Hangyu Mao, Wulong Liu, Jianye Hao, Jun Luo, Dong Li, Zhengchao Zhang, Jun Wang, Zhen Xiao
2019-12-03
Propagating Uncertainty in Reinforcement Learning via Wasserstein Barycenters
Alberto Maria Metelli, Amarildo Likmeta, Marcello Restelli
2019-12-01
Learning Mean-Field Games
Xin Guo, Anran Hu, Renyuan Xu, Junzi Zhang
2019-12-01
Provably Efficient Q-learning with Function Approximation via Distribution Shift Error Checking Oracle
Simon S. Du, Yuping Luo, Ruosong Wang, Hanrui Zhang
2019-12-01
Modelling the Dynamics of Multiagent Q-Learning in Repeated Symmetric Games: a Mean Field Theoretic Approach
Shuyue Hu, Chin-Wing Leung, Ho-Fung Leung
2019-12-01
Reconciling λ-Returns with Experience Replay
Brett Daley, Christopher Amato
2019-12-01
Quadratic Q-network for Learning Continuous Control for Autonomous Vehicles
Pin Wang, Hanhan Li, Ching-Yao Chan
2019-11-29
Join Query Optimization with Deep Reinforcement Learning Algorithms
Jonas Heitz, Kurt Stockinger
2019-11-26
Control-Tutored Reinforcement Learning: an application to the Herding Problem
Francesco De Lellis, Fabrizia Auletta, Giovanni Russo, Mario di Bernardo
2019-11-26
Mitigate Bias in Face Recognition using Skewness-Aware Reinforcement Learning
Mei Wang, Weihong Deng
2019-11-25
Adaptive Modulation and Coding based on Reinforcement Learning for 5G Networks
Mateus P. Mota, Daniel C. Araujo, Francisco Hugo Costa Neto, Andre L. F. de Almeida, F. Rodrigo P. Cavalcanti
2019-11-25
Which Channel to Ask My Question? Personalized Customer Service Request Stream Routing using Deep Reinforcement Learning
Zining Liu, Chong Long, Xiaolu Lu, Zehong Hu, Jie Zhang, Yafang Wang
2019-11-24
Efficient Drone Mobility Support Using Reinforcement Learning
Yun Chen, Xingqin Lin, Talha Khan, Mohammad Mozaffari
2019-11-21
Quantum Observables for continuous control of the Quantum Approximate Optimization Algorithm via Reinforcement Learning
Artur Garcia-Saez, Jordi Riu
2019-11-21
Placement Optimization of Aerial Base Stations with Deep Reinforcement Learning
Jin Qiu, Jiangbin Lyu, Liqun Fu
2019-11-19
Asymptotics of Reinforcement Learning with Neural Networks
Justin Sirignano, Konstantinos Spiliopoulos
2019-11-13
Minimalistic Attacks: How Little it Takes to Fool a Deep Reinforcement Learning Policy
Xinghua Qu, Zhu Sun, Yew-Soon Ong, Abhishek Gupta, Pengfei Wei
2019-11-10
Two-stage WECC Composite Load Modeling: A Double Deep Q-Learning Networks Approach
Xinan Wang, Yishen Wang, Di Shi, Jianhui Wang, Zhiwei Wang
2019-11-08
An End-to-End Deep RL Framework for Task Arrangement in Crowdsourcing Platforms
Caihua Shan, Nikos Mamoulis, Reynold Cheng, Guoliang Li, Xiang Li, Yuqiu Qian
2019-11-04
On Solving the 2-Dimensional Greedy Shooter Problem for UAVs
Loren Anderson, Sahitya Senapathy
2019-11-02
Challenging On Car Racing Problem from OpenAI gym
Changmao Li
2019-11-02
Generalized Speedy Q-learning
Indu John, Chandramouli Kamanchi, Shalabh Bhatnagar
2019-11-01
Model-Free Mean-Field Reinforcement Learning: Mean-Field MDP and Mean-Field Q-Learning
René Carmona, Mathieu Laurière, Zongjun Tan
2019-10-28
Biomimetic Ultra-Broadband Perfect Absorbers Optimised with Reinforcement Learning
Trevon Badloe, Inki Kim, Junsuk Rho
2019-10-28
Task-Oriented Language Grounding for Language Input with Multiple Sub-Goals of Non-Linear Order
Vladislav Kurenkov, Bulat Maksudov, Adil Khan
2019-10-27
ZPD Teaching Strategies for Deep Reinforcement Learning from Demonstrations
Daniel Seita, David Chan, Roshan Rao, Chen Tang, Mandi Zhao, John Canny
2019-10-26
D-Point Trigonometric Path Planning based on Q-Learning in Uncertain Environments
Ehsan Jeihaninejad, Azam Rabiee
2019-10-26
Deep Q-Learning for Same-Day Delivery with a Heterogeneous Fleet of Vehicles and Drones
Xinwei Chen, Marlin W. Ulmer, Barrett W. Thomas
2019-10-25
Momentum in Reinforcement Learning
Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist
2019-10-21
Resource Allocation in Mobility-Aware Federated Learning Networks: A Deep Reinforcement Learning Approach
Huy T. Nguyen, Nguyen Cong Luong, Jun Zhao, Chau Yuen, Dusit Niyato
2019-10-21
Policy Learning for Malaria Control
Van Bach Nguyen, Belaid Mohamed Karim, Bao Long Vu, Jörg Schlötterer, Michael Granitzer
2019-10-20
Reverse Experience Replay
Egor Rotinov
2019-10-19
Automatic Data Augmentation by Learning the Deterministic Policy
Yinghuan Shi, Tiexin Qin, Yong Liu, Jiwen Lu, Yang Gao, Dinggang Shen
2019-10-18
On the Reduction of Variance and Overestimation of Deep Q-Learning
Mohammed Sabry, Amr M. A. Khalifa
2019-10-14
Zap Q-Learning With Nonlinear Function Approximation
Shuhang Chen, Adithya M. Devraj, Fan Lu, Ana Bušić, Sean P. Meyn
2019-10-11
Integrating Behavior Cloning and Reinforcement Learning for Improved Performance in Dense and Sparse Reward Environments
Vinicius G. Goecks, Gregory M. Gremillion, Vernon J. Lawhern, John Valasek, Nicholas R. Waytowich
2019-10-09
Knowledge Induced Deep Q-Network for a Slide-to-Wall Object Grasping
Hengyue Liang, Xibai Lou, Changhyun Choi
2019-10-09
Toward Synergic Learning for Autonomous Manipulation of Deformable Tissues via Surgical Robots: An Approximate Q-Learning Approach
Sahba Aghajani Pedram, Peter Walker Ferguson, Changyeob Shin, Ankur Mehta, Erik P. Dutson, Farshid Alambeigi, Jacob Rosen
2019-10-08
Reinforcement Learning with Structured Hierarchical Grammar Representations of Actions
Petros Christodoulou, Robert Tjarko Lange, Ali Shafti, A. Aldo Faisal
2019-10-07
Multi-step Greedy Reinforcement Learning Algorithms
Manan Tomar, Yonathan Efroni, Mohammad Ghavamzadeh
2019-10-07
"I'm sorry Dave, I'm afraid I can't do that" Deep Q-learning from forbidden action
Mathieu Seurin, Philippe Preux, Olivier Pietquin
2019-10-04
Deep Q-Network for Angry Birds
Ekaterina Nikonova, Jakub Gemrot
2019-10-04
Benchmarking Batch Deep Reinforcement Learning Algorithms
Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau
2019-10-03
AI Assisted Annotator using Reinforcement Learning
V. Ratna Saripalli, Gopal Avinash, Dibyajyoti Pati, Michael Potter, Charles W. Anderson
2019-10-02
Quantile QT-Opt for Risk-Aware Vision-Based Robotic Grasping
Cristian Bodnar, Adrian Li, Karol Hausman, Peter Pastor, Mrinal Kalakrishnan
2019-10-01
Off-policy Multi-step Q-learning
Gabriel Kalweit, Maria Huegle, Joschka Boedecker
2019-09-30
Meta-Q-Learning
Rasool Fakoor, Pratik Chaudhari, Stefano Soatto, Alexander J. Smola
2019-09-30
Energy-aware Goal Selection and Path Planning of UAV Systems via Reinforcement Learning
A. E. Niaraki AsliJ. RoghairA. Jannesari
2019-09-26
CAQL: Continuous Action Q-Learning
Moonkyung RyuYinlam ChowRoss AndersonChristian TjandraatmadjaCraig Boutilier
2019-09-26
Active inference: demystified and compared
Noor SajidPhilip J. BallKarl J. Friston
2019-09-24
Deep Reinforcement Learning with Modulated Hebbian plus Q Network Architecture
Pawel LadoszEseoghene Ben-IwhiwhuJeffery DickYang HuNicholas KetzSoheil KolouriJeffrey L. KrichmarPraveen PillyAndrea Soltoggio
2019-09-21
On the Convergence of Approximate and Regularized Policy Iteration Schemes
Elena SmirnovaElvis Dohmatob
2019-09-20
ModelicaGym: Applying Reinforcement Learning to Modelica Models
Oleh LukianykhinTetiana Bogodorova
2019-09-18
Split Deep Q-Learning for Robust Object Singulation
Iason SarantopoulosMarios KiatosZoe DoulgeriSotiris Malassiotis
2019-09-17
Joint Inference of Reward Machines and Policies for Reinforcement Learning
Zhe XuIvan GavranYousef AhmadRupak MajumdarDaniel NeiderUfuk TopcuBo Wu
2019-09-12
Mutual-Information Regularization in Markov Decision Processes and Actor-Critic Learning
Felix LeibfriedJordi Grau-Moya
2019-09-11
A Multistep Lyapunov Approach for Finite-Time Analysis of Biased Stochastic Approximation
Gang WangBingcong LiGeorgios B. Giannakis
2019-09-10
Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning
Kristopher De AsisAlan ChanSilviu PitisRichard S. SuttonDaniel Graves
2019-09-09
Multi Pseudo Q-learning Based Deterministic Policy Gradient for Tracking Control of Autonomous Underwater Vehicles
Wenjie ShiShiji SongCheng WuC. L. Philip Chen
2019-09-07
Encoders and Decoders for Quantum Expander Codes Using Machine Learning
Sathwik ChadagaMridul AgarwalVaneet Aggarwal
2019-09-06
Reinforcement Learning with Non-Markovian Rewards
Mridul AgarwalVaneet Aggarwal
2019-09-06
Q-DATA: Enhanced Traffic Flow Monitoring in Software-Defined Networks applying Q-learning
Trung V. PhanSyed Tasnimul IslamTri Gia NguyenThomas Bauschert
2019-09-04
Solving Discounted Stochastic Two-Player Games with Near-Optimal Time and Sample Complexity
Aaron SidfordMengdi WangLin F. YangYinyu Ye
2019-08-29
Networked Control of Nonlinear Systems under Partial Observation Using Continuous Deep Q-Learning
Junya IkemotoToshimitsu Ushio
2019-08-28
STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Cooperative Traffic Light Control
Yanan WangTong XuXin NiuChang TanEnhong ChenHui Xiong
2019-08-28
Performing Deep Recurrent Double Q-Learning for Atari Games
Felipe Moreno-Vera
2019-08-16
Learn How to Cook a New Recipe in a New House: Using Map Familiarization, Curriculum Learning, and Bandit Feedback to Learn Families of Text-Based Adventure Games
Xusen YinJonathan May
2019-08-13
Large-scale Traffic Signal Control Using a Novel Multi-Agent Reinforcement Learning
Xiaoqiang WangLiangjun KeZhimin QiaoXinghua Chai
2019-08-10
Q-MIND: Defeating Stealthy DoS Attacks in SDN with a Machine-learning based Defense Framework
Trung V. PhanT M Rayhan GiasSyed Tasnimul IslamTruong Thu HuongNguyen Huu ThanhThomas Bauschert
2019-07-27
An Optimistic Perspective on Offline Reinforcement Learning
Rishabh AgarwalDale SchuurmansMohammad Norouzi
2019-07-10
Way Off-Policy Batch Deep Reinforcement Learning of Implicit Human Preferences in Dialog
Natasha JaquesAsma GhandehariounJudy Hanwen ShenCraig FergusonAgata LapedrizaNoah JonesShixiang GuRosalind Picard
2019-06-30
Q-Learning Inspired Self-Tuning for Energy Efficiency in HPC
Andreas GochtRobert SchöneMario Bielert
2019-06-26
Towards Empathic Deep Q-Learning
Bart BussmannJacqueline HeinermanJoel Lehman
2019-06-26
Learning Causal State Representations of Partially Observable Environments
Amy ZhangZachary C. LiptonLuis PinedaKamyar AzizzadenesheliAnima AnandkumarLaurent IttiJoelle PineauTommaso Furlanello
2019-06-25
In Hindsight: A Smooth Reward for Steady Exploration
Hadi S. JomaaJosif GrabockaLars Schmidt-Thieme
2019-06-24
Optimal Use of Experience in First Person Shooter Environments
Matthew Aitchison
2019-06-24
Reinforcement Learning-Based Trajectory Design for the Aerial Base Stations
Behzad KhamidehiElvino S. Sousa
2019-06-23
Neural networks with motivation
Sergey A. ShuvaevNgoc B. TranMarcus Stephenson-JonesBo LiAlexei A. Koulakov
2019-06-23
A Story of Two Streams: Reinforcement Learning Models from Human Behavior and Neuropsychiatry
Baihan LinGuillermo CecchiDjallel BouneffoufJenna ReinenIrina Rish
2019-06-21
Split Q Learning: Reinforcement Learning with Two-Stream Rewards
Baihan LinDjallel BouneffoufGuillermo Cecchi
2019-06-21
Solution of Two-Player Zero-Sum Game by Successive Relaxation
Raghuram Bharadwaj DiddigiChandramouli KamanchiShalabh Bhatnagar
2019-06-16
Boosting Soft Actor-Critic: Emphasizing Recent Experience without Forgetting the Past
Che WangKeith Ross
2019-06-10
Exploration with Unreliable Intrinsic Reward in Multi-Agent Reinforcement Learning
Wendelin BöhmerTabish RashidShimon Whiteson
2019-06-05
Risk-Sensitive Compact Decision Trees for Autonomous Execution in Presence of Simulated Market Response
Svitlana VyetrenkoShaojie Xu
2019-06-05
Escaping the State of Nature: A Hobbesian Approach to Cooperation in Multi-agent Reinforcement Learning
William Long
2019-06-05
Reinforcement Learning with Low-Complexity Liquid State Machines
Wachirawit PonghiranGopalakrishnan SrinivasanKaushik Roy
2019-06-04
Stabilizing Off-Policy Q-Learning via Bootstrapping Error Reduction
Aviral KumarJustin FuGeorge TuckerSergey Levine
2019-06-03
Sequential Triggers for Watermarking of Deep Reinforcement Learning Policies
Vahid BehzadanWilliam Hsu
2019-06-03
Analysis and Improvement of Adversarial Training in DQN Agents With Adversarially-Guided Exploration (AGE)
Vahid BehzadanWilliam Hsu
2019-06-03
RL-Based Method for Benchmarking the Adversarial Resilience and Robustness of Deep Reinforcement Learning Policies
Vahid BehzadanWilliam Hsu
2019-06-03
Feature-Based Q-Learning for Two-Player Stochastic Games
Zeyu JiaLin F. YangMengdi Wang
2019-06-02
RSS-Based Q-Learning for Indoor UAV Navigation
Md Moin Uddin ChowdhuryFatih ErdenIsmail Guvenc
2019-05-31
Provably Efficient Q-Learning with Low Switching Cost
Yu BaiTengyang XieNan JiangYu-Xiang Wang
2019-05-30
Reinforcement Learning for Slate-based Recommender Systems: A Tractable Decomposition and Practical Methodology
Eugene IeVihan JainJing WangSanmit NarvekarRitesh AgarwalRui WuHeng-Tze ChengMorgane LustmanVince GattoPaul CovingtonJim McFaddenTushar ChandraCraig Boutilier
2019-05-29
Solving NP-Hard Problems on Graphs with Extended AlphaGo Zero
Kenshin AbeZijian XuIssei SatoMasashi Sugiyama
2019-05-28
Learning distant cause and effect using only local and immediate credit assignment
David RawlinsonAbdelrahman AhmedGideon Kowadlo
2019-05-28
SQIL: Imitation Learning via Reinforcement Learning with Sparse Rewards
Siddharth ReddyAnca D. DraganSergey Levine
2019-05-27
Finite-Sample Analysis of Nonlinear Stochastic Approximation with Applications in Reinforcement Learning
Zaiwei ChenSheng ZhangThinh T. DoanJohn-Paul ClarkeSiva Theja Maguluri
2019-05-27
A Kernel Loss for Solving the Bellman Equation
Yihao FengLihong LiQiang Liu
2019-05-25
Prioritized Sequence Experience Replay
Marc BrittainJosh BertramXuxi YangPeng Wei
2019-05-25
Adaptive Symmetric Reward Noising for Reinforcement Learning
Refael VivantiTalya D. Sohlberg-BarisShlomo CohenOrna Cohen
2019-05-24
MQLV: Optimal Policy of Money Management in Retail Banking with Q-Learning
Jeremy CharlierGaston OrmazabalRadu StateJean Hilger
2019-05-24
Deep Q-Learning with Q-Matrix Transfer Learning for Novel Fire Evacuation Environment
Jivitesh SharmaPer-Arne AndersenOle-Chrisoffer GranmoMorten Goodwin
2019-05-23
Stochastic Variance Reduction for Deep Q-learning
Wei-Ye ZhaoXi-Ya GuanYang LiuXiaoming ZhaoJian Peng
2019-05-20
Deep Reinforcement Learning Based Parameter Control in Differential Evolution
Mudita SharmaAlexandros KomninosManuel Lopez IbanezDimitar Kazakov
2019-05-20
Reinforcement Learning for Learning of Dynamical Systems in Uncertain Environment: a Tutorial
Mehran AttarMohammadreza Dabirian
2019-05-19
Mastering the Game of Sungka from Random Play
Darwin BautistaRaimarc Dionido
2019-05-17
QBSO-FS: A Reinforcement Learning Based Bee Swarm Optimization Metaheuristic for Feature Selection
Souhila SadegLeila HamdadAmine Riad RemacheMehdi Nedjmeddine KarechKarima BenatchbaZineb Habbas
2019-05-16
Reinforcement Learning for Robotics and Control with Active Uncertainty Reduction
Narendra PatwardhanZequn Wang
2019-05-15
Autonomous Penetration Testing using Reinforcement Learning
Jonathon SchwartzHanna Kurniawati
2019-05-15
Design of Artificial Intelligence Agents for Games using Deep Reinforcement Learning
Andrei Claudiu Roibu
2019-05-10
Domain Adversarial Reinforcement Learning for Partial Domain Adaptation
Jin ChenXinxiao WuLixin DuanShenghua Gao
2019-05-10
Pretrain Soft Q-Learning with Imperfect Demonstrations
Xiaoqin ZhangYunfei LiHuimin MaXiong Luo
2019-05-09
A Reinforcement Learning Perspective on the Optimal Control of Mutation Probabilities for the (1+1) Evolutionary Algorithm: First Results on the OneMax Problem
Luca MossinaEmmanuel RachelsonDaniel Delahaye
2019-05-09
Accelerated Target Updates for Q-learning
Bowen WengHuaqing XiongWei Zhang
2019-05-07
Deep Ordinal Reinforcement Learning
Alexander ZapTobias JoppenJohannes Fürnkranz
2019-05-06
Comprehensible Context-driven Text Game Playing
Xusen YinJonathan May
2019-05-06
Learning agents with prioritization and parameter noise in continuous state and action space
Rajesh DevaraddiG. Srinivasaraghavan
2019-05-01
Efficient Model-free Reinforcement Learning in Metric Spaces
Zhao SongWen Sun
2019-05-01
Beyond Games: Bringing Exploration to Robots in Real-world
Deepak PathakDhiraj GandhiAbhinav Gupta
2019-05-01
Inducing Cooperation via Learning to reshape rewards in semi-cooperative multi-agent reinforcement learning
David Earl HostalleroDaewoo KimKyunghwan SonYung Yi
2019-05-01
Recurrent Experience Replay in Distributed Reinforcement Learning
Steven KapturowskiGeorg OstrovskiWill DabneyJohn QuanRemi Munos
2019-05-01
A Deep Q-Learning Method for Downlink Power Allocation in Multi-Cell Networks
Kazi Ishfaq AhmedEkram Hossain
2019-04-30
Generative Adversarial Imagination for Sample Efficient Deep Reinforcement Learning
Kacper Kielak
2019-04-30
Zap Q-Learning for Optimal Stopping Time Problems
Shuhang ChenAdithya M. DevrajAna BušićSean P. Meyn
2019-04-25
Target-Based Temporal Difference Learning
Donghwan LeeNiao He
2019-04-24
Stochastic Lipschitz Q-Learning
Xu ZhuDavid Dunson
2019-04-24
Deep Q Learning Driven CT Pancreas Segmentation with Geometry-Aware U-Net
Yunze ManYangsibo HuangJunyi FengXi LiFei Wu
2019-04-19
"Jam Me If You Can": Defeating Jammer with Deep Dueling Neural Network Architecture and Ambient Backscattering Augmented Communications
Nguyen Van HuynhDiep N. NguyenDinh Thai HoangEryk Dutkiewicz
2019-04-08
Personalized Cancer Chemotherapy Schedule: a numerical comparison of performance and robustness in model-based and model-free scheduling methodologies
Jesus TordesillasJuncal Arbelaiz
2019-04-02
Lane Change Decision-making through Deep Reinforcement Learning with Rule-based Constraints
Junjie WangQichao ZhangDongbin ZhaoYaran Chen
2019-03-30
Improved robustness of reinforcement learning policies upon conversion to spiking neuronal network platforms applied to ATARI games
Devdhar PatelHananel HazanDaniel J. SaundersHava SiegelmannRobert Kozma
2019-03-26
Q-Learning for Continuous Actions with Cross-Entropy Guided Policies
Riley Simmons-EdlerBen EisnerEric MitchellSebastian SeungDaniel Lee
2019-03-25
DQN with model-based exploration: efficient learning on environments with sparse rewards
Stephen Zhen GouYuyang Liu
2019-03-22
Towards Characterizing Divergence in Deep Q-Learning
Joshua AchiamEthan KnightPieter Abbeel
2019-03-21
Deep Reinforcement Learning with Decorrelation
Borislav MavrinHengshuai YaoLinglong Kong
2019-03-18
Reinforcement Learning with Dynamic Boltzmann Softmax Updates
Ling PanQingpeng CaiQi MengWei ChenLongbo HuangTie-Yan Liu
2019-03-14
Deep Multi-Agent Reinforcement Learning with Discrete-Continuous Hybrid Action Spaces
Haotian FuHongyao TangJianye HaoZihan LeiYingfeng ChenChangjie Fan
2019-03-12
Deep Recurrent Q-Learning vs Deep Q-Learning on a simple Partially Observable Markov Decision Process with Minecraft
Clément RomacVincent Béraud
2019-03-11
Multi-Agent Deep Reinforcement Learning for Large-scale Traffic Signal Control
Tianshu ChuJie WangLara CodecàZhaojian Li
2019-03-11
Sample-Efficient Model-Free Reinforcement Learning with Off-Policy Critics
Denis SteckelmacherHélène PlisnierDiederik M. RoijersAnn Nowé
2019-03-11
Successive Over Relaxation Q-Learning
Chandramouli KamanchiRaghuram Bharadwaj DiddigiShalabh Bhatnagar
2019-03-09
DeepPool: Distributed Model-free Algorithm for Ride-sharing using Deep Reinforcement Learning
Abubakr AlabbasiArnob GhoshVaneet Aggarwal
2019-03-09
Learning Heuristics over Large Graphs via Deep Reinforcement Learning
Sahil ManchandaAkash MittalAnuj DhawanSourav MedyaSayan RanuAmbuj Singh
2019-03-08
MinAtar: An Atari-Inspired Testbed for Thorough and Reproducible Reinforcement Learning Experiments
Kenny YoungTian Tian
2019-03-07
Unifying Ensemble Methods for Q-learning via Social Choice Theory
Rishav ChourasiaAdish Singla
2019-02-27
Distributed Edge Caching via Reinforcement Learning in Fog Radio Access Networks
Liuyang LuYanxiang JiangMehdi BennisZhiguo DingFu-Chun ZhengXiaohu You
2019-02-27
Optimal and Fast Real-time Resources Slicing with Deep Dueling Neural Networks
Nguyen Van HuynhDinh Thai HoangDiep N. NguyenEryk Dutkiewicz
2019-02-26
Diagnosing Bottlenecks in Deep Q-learning Algorithms
Justin FuAviral KumarMatthew SohSergey Levine
2019-02-26
Autonomous Airline Revenue Management: A Deep Reinforcement Learning Approach to Seat Inventory Control and Overbooking
Syed Arbab Mohd ShihabCaleb LogemannDeepak-George ThomasPeng Wei
2019-02-18
Heuristics, Answer Set Programming and Markov Decision Process for Solving a Set of Spatial Puzzles
Thiago Freitas dos SantosPaulo E. SantosLeonardo A. FerreiraReinaldo A. C. BianchiPedro Cabalar
2019-02-16
Sample-Optimal Parametric Q-Learning Using Linearly Additive Features
Lin F. YangMengdi Wang
2019-02-13
Learning Best Response Strategies for Agents in Ad Exchanges
Stavros GerakarisSubramanian Ramamoorthy
2019-02-10
When reinforcement learning stands out in quantum control? A comparative study on state preparation
Xiao-Ming ZhangZezhu WeiRaza AsadXu-Chen YangXin Wang
2019-02-06
Approximate Logic Synthesis: A Reinforcement Learning-Based Technology Mapping Approach
Ghasem PasandiShahin NazarianMassoud Pedram
2019-02-01
Making Deep Q-learning methods robust to time discretization
Corentin TallecLéonard BlierYann Ollivier
2019-01-28
Q-learning with UCB Exploration is Sample Efficient for Infinite-Horizon MDP
Kefan DongYuanhao WangXiaoyu ChenLiwei Wang
2019-01-27
Reward Shaping via Meta-Learning
Haosheng ZouTongzheng RenDong YanHang SuJun Zhu
2019-01-27
Combinational Q-Learning for Dou Di Zhu
Yang YouLiangwei LiBaisong GuoWeiming WangCewu Lu
2019-01-24
Reinforcement Learning of Markov Decision Processes with Peak Constraints
Ather Gattami
2019-01-23
Distillation Strategies for Proximal Policy Optimization
Sam GreenCraig M. VineyardÇetin Kaya Koç
2019-01-23
Understanding Multi-Step Deep Reinforcement Learning: A Systematic Study of the DQN Target
J. Fernando Hernandez-GarciaRichard S. Sutton
2019-01-22
A Deep Recurrent Q Network towards Self-adapting Distributed Microservices architecture
Basel Magableh
2019-01-13
Deep Reinforcement Learning for Imbalanced Classification
Enlu LinQiong ChenXiaoming Qi
2019-01-05
Accelerating Goal-Directed Reinforcement Learning by Model Characterization
Shoubhik DebnathGaurav SukhatmeLantao Liu
2019-01-04
A Theoretical Analysis of Deep Q-Learning
Jianqing FanZhaoran WangYuchen XieZhuoran Yang
2019-01-01
Generative Adversarial User Model for Reinforcement Learning Based Recommendation System
Xinshi ChenShuang LiHui LiShaohua JiangYuan QiLe Song
2018-12-27
Parallelized Interactive Machine Learning on Autonomous Vehicles
Xi ChenCaylin Hickey
2018-12-23
Learning to Navigate the Web
Izzeddin GurUlrich RueckertAleksandra FaustDilek Hakkani-Tur
2018-12-21
Double Deep Q-Learning for Optimal Execution
Brian NingFranco Ho Ting LinSebastian Jaimungal
2018-12-17
Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach
Zhao ChenXiaodong Wang
2018-12-16
Learning Sharing Behaviors with Arbitrary Numbers of Agents
Katherine MetcalfBarry-John TheobaldNicholas Apostoloff
2018-12-10
Off-Policy Deep Reinforcement Learning without Exploration
Scott FujimotoDavid MegerDoina Precup
2018-12-07
Power Allocation in Multi-user Cellular Networks With Deep Q Learning Approach
Fan MengPeng ChenLenan Wu
2018-12-07
Active Deep Q-learning with Demonstration
Si-An ChenVoot TangkarattHsuan-Tien LinMasashi Sugiyama
2018-12-06
Bach2Bach: Generating Music Using A Deep Reinforcement Learning Approach
Nikhil Kotecha
2018-12-03
Deep Reinforcement Learning for Intelligent Transportation Systems
Xiao-Yang LiuZihan DingSem BorstAnwar Walid
2018-12-03
Revisiting the Softmax Bellman Operator: New Benefits and New Perspective
Zhao SongRonald E. ParrLawrence Carin
2018-12-02
Macro action selection with deep reinforcement learning in StarCraft
Sijia XuHongyu KuangZhi ZhuangRenjie HuYang LiuHuyang Sun
2018-12-02
Non-delusional Q-learning and value-iteration
Tyler LuDale SchuurmansCraig Boutilier
2018-12-01
Deep Multi-Agent Reinforcement Learning with Relevance Graphs
Aleksandra MalyshevaTegg Taekyong SungChae-Bong SohnDaniel KudenkoAleksei Shpilman
2018-11-30
Urban Driving with Multi-Objective Deep Reinforcement Learning
Changjian LiKrzysztof Czarnecki
2018-11-21
Reinforcement Learning with A* and a Deep Heuristic
Ariel KeselmanSergey TenAdham GhazaliMajed Jubeh
2018-11-19
Emergence of Addictive Behaviors in Reinforcement Learning Agents
Vahid BehzadanRoman V. YampolskiyArslan Munir
2018-11-14
Managing App Install Ad Campaigns in RTB: A Q-Learning Approach
Anit Kumar SahuShaunak MishraNarayan Bhamidipati
2018-11-11
An initial attempt of combining visual selective attention with deep reinforcement learning
Liu YuezhangRuohan ZhangDana H. Ballard
2018-11-11
Deep Reinforcement Learning for Green Security Games with Real-Time Information
Yufei WangZheyuan Ryan ShiLantao YuYi WuRohit SinghLucas JoppaFei Fang
2018-11-06
Reinforcement Learning based Dynamic Model Selection for Short-Term Load Forecasting
Cong FengJie Zhang
2018-11-05
Approximate Dynamic Oracle for Dependency Parsing with Reinforcement Learning
Xiang YuNgoc Thang VuJonas Kuhn
2018-11-01
Structure Learning of Deep Neural Networks with Q-Learning
Guoqiang ZhongWencong JiaoWei Gao
2018-10-31
Distributive Dynamic Spectrum Access through Deep Reinforcement Learning: A Reservoir Computing Based Approach
Hao-Hsuan ChangHao SongYang YiJianzhong ZhangHaibo HeLingjia Liu
2018-10-28
Learning Negotiating Behavior Between Cars in Intersections using Deep Q-Learning
Tommy TramAnton JanssonRobin GrönbergMohammad AliJonas Sjöberg
2018-10-24
Reconciling $λ$-Returns with Experience Replay
Brett DaleyChristopher Amato
2018-10-23
Actor-Expert: A Framework for using Q-learning in Continuous Action Spaces
Sungsu LimAjin JosephLei LeYangchen PanMartha White
2018-10-22
Successor Uncertainties: Exploration and Uncertainty in Temporal Difference Learning
David JanzJiri HronPrzemysław MazurKatja HofmannJosé Miguel Hernández-LobatoSebastian Tschiatschek
2018-10-15
Assessing the Potential of Classical Q-learning in General Game Playing
Hui WangMichael EmmerichAske Plaat
2018-10-14
Learning to Sketch with Deep Q Networks and Demonstrated Strokes
Tao ZhouChen FangZhaowen WangJimei YangByungmoon KimZhili ChenJonathan BrandtDemetri Terzopoulos
2018-10-14
Empowerment-driven Exploration using Mutual Information Estimation
Navneet Madhu Kumar
2018-10-11
Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
Jiechao XiongQing WangZhuoran YangPeng SunLei HanYang ZhengHaobo FuTong ZhangJi LiuHan Liu
2018-10-10
Deep Quality-Value (DQV) Learning
Matthia SabatelliGilles LouppePierre GeurtsMarco A. Wiering
2018-09-30
Generalization and Regularization in DQN
Jesse FarebrotherMarlos C. MachadoMichael Bowling
2018-09-29
Target Transfer Q-Learning and Its Convergence Analysis
Yue WangQi MengWei ChengYuting LiugZhi-Ming MaTie-Yan Liu
2018-09-21
Optimal Matrix Momentum Stochastic Approximation and Applications to Q-learning
Adithya M. DevrajAna BušićSean Meyn
2018-09-17
Hidden Markov Model Estimation-Based Q-learning for Partially Observable Markov Decision Process
Hyung-Jin YoonDonghwan LeeNaira Hovakimyan
2018-09-17
Sampled Policy Gradient for Learning to Play the Game Agar.io
Anton Orell WieheNil Stolt AnsóMadalina M. DruganMarco A. Wiering
2018-09-15
Deterministic Implementations for Reproducibility in Deep Reinforcement Learning
Prabhat NagarajanGarrett WarnellPeter Stone
2018-09-15
Towards Better Interpretability in Deep Q-Networks
Raghuram Mandyam AnnasamyKatia Sycara
2018-09-15
Negative Update Intervals in Deep Multi-Agent Reinforcement Learning
Gregory PalmerRahul SavaniKarl Tuyls
2018-09-13
Coordinated Heterogeneous Distributed Perception based on Latent Space Representation
Timo KorthalsJürgen LeitnerUlrich Rückert
2018-09-12
Factorized Q-Learning for Large-Scale Multi-Agent Systems
Yong ChenMing ZhouYing WenYaodong YangYufeng SuWeinan ZhangDell ZhangJun WangHan Liu
2018-09-11
Learn What Not to Learn: Action Elimination with Deep Reinforcement Learning
Tom ZahavyMatan HaroushNadav MerlisDaniel J. MankowitzShie Mannor
2018-09-06
Model-Based Regularization for Deep Reinforcement Learning with Transcoder Networks
Felix LeibfriedPeter Vrancx
2018-09-06
Directed Exploration in PAC Model-Free Reinforcement Learning
Min-hwan OhGarud Iyengar
2018-08-31
MARL-FWC: Optimal Coordination of Freeway Traffic Control Measures
Ahmed FaresWalid GomaaMohamed A. Khamis
2018-08-27
BlockQNN: Efficient Block-wise Neural Network Architecture Generation
Zhao ZhongZichen YangBoyang DengJunjie YanWei WuJing ShaoCheng-Lin Liu
2018-08-16
Automatic Derivation Of Formulas Using Reforcement Learning
MinZhong LuoLi Liu
2018-08-15
A Framework for Automated Cellular Network Tuning with Reinforcement Learning
Faris B. MismarJinseok ChoiBrian L. Evans
2018-08-13
A Reinforcement Learning Approach to Target Tracking in a Camera Network
Anil SharmaPrabhat KumarSaket AnandSanjit K. Kaul
2018-07-26
Accelerated Structure-Aware Reinforcement Learning for Delay-Sensitive Energy Harvesting Wireless Sensors
Nikhilesh SharmaNicholas MastronardeJacob Chakareski
2018-07-22
Discrete linear-complexity reinforcement learning in continuous action spaces for Q-learning algorithms
Peyman TavallaliGary B. Doran Jr.Lukas Mandrake
2018-07-16
Is Q-learning Provably Efficient?
Chi JinZeyuan Allen-ZhuSebastien BubeckMichael I. Jordan
2018-07-10
Video Summarisation by Classification with Deep Reinforcement Learning
Kaiyang ZhouTao XiangAndrea Cavallaro
2018-07-09
Playing against Nature: causal discovery for decision making under uncertainty
M. Gonzalez-SotoL. E. SucarH. J. Escalante
2018-07-03
Learning to Explore via Meta-Policy Gradient
Tianbing XuQiang LiuLiang ZhaoJian Peng
2018-07-01
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning
Rodrigo Toro IcarteToryn KlassenRichard ValenzanoSheila McIlraith
2018-07-01
Many-Goals Reinforcement Learning
Vivek VeeriahJunhyuk OhSatinder Singh
2018-06-22
Reinforcement Learning using Augmented Neural Networks
Jack ShannonMarek Grzes
2018-06-20
Surprising Negative Results for Generative Adversarial Tree Search
Kamyar AzizzadenesheliBrandon YangWeitang LiuZachary C LiptonAnimashree Anandkumar
2018-06-15
Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network
Wenjia MengQian ZhengLong YangPengfei LiGang Pan
2018-06-14
Implicit Quantile Networks for Distributional Reinforcement Learning
Will DabneyGeorg OstrovskiDavid SilverRémi Munos
2018-06-14
Automatic formation of the structure of abstract machines in hierarchical reinforcement learning with state clustering
Aleksandr I. PanovAleksey Skrynnik
2018-06-13
Learning to Search in Long Documents Using Document Structure
Mor GevaJonathan Berant
2018-06-09
Fidelity-based Probabilistic Q-learning for Control of Quantum Systems
Chunlin ChenDaoyi DongHan-Xiong LiJian ChuTzyh-Jong Tarn
2018-06-08
A Finite Time Analysis of Temporal Difference Learning With Linear Function Approximation
Jalaj BhandariDaniel RussoRaghav Singal
2018-06-06
Joint Power Allocation in Interference-Limited Networks via Distributed Coordinated Learning
Roohollah AmiriHani MehrpouyanDavid MatolakMaged Elkashlan
2018-06-06
Randomized Value Functions via Multiplicative Normalizing Flows
Ahmed TouatiHarsh SatijaJoshua RomoffJoelle PineauPascal Vincent
2018-06-06
Hyperparameter Optimization for Tracking With Continuous Deep Q-Learning
Xingping DongJianbing ShenWenguan WangYu LiuLing ShaoFatih Porikli
2018-06-01
Sample-Efficient Deep Reinforcement Learning via Episodic Backward Update
Su Young LeeSungik ChoiSae-Young Chung
2018-05-31
Depth and nonlinearity induce implicit exploration for RL
Justas DauparasRyota TomiokaKatja Hofmann
2018-05-29
Episodic Memory Deep Q-Networks
Zichuan LinTianqi ZhaoGuangwen YangLintao Zhang
2018-05-19
Optimized Computation Offloading Performance in Virtual Edge Computing Systems via Deep Reinforcement Learning
Xianfu ChenHonggang ZhangCelimuge WuShiwen MaoYusheng JiMehdi Bennis
2018-05-16
Advances in Experience Replay
Tracy WanNeil Xu
2018-05-15
Planning and Learning with Stochastic Action Sets
Craig BoutilierAlon CohenAmit DanielyAvinatan HassidimYishay MansourOfer MeshiMartin MladenovDale Schuurmans
2018-05-07
A Hybrid Q-Learning Sine-Cosine-based Strategy for Addressing the Combinatorial Test Suite Minimization Problem
Kamal Z. ZamliFakhrud DinBestoun S. AhmedMiroslav Bures
2018-04-27
Multiagent Soft Q-Learning
Ermo WeiDrew WickeDavid FreelanSean Luke
2018-04-25
Benchmarking projective simulation in navigation problems
Alexey A. MelnikovAdi MakmalHans J. Briegel
2018-04-23
Towards Symbolic Reinforcement Learning with Common Sense
Artur d'Avila GarcezAimore Resende Riquetti DutraEduardo Alonso
2018-04-23
Reinforced Co-Training
Jiawei WuLei LiWilliam Yang Wang
2018-04-17
State-Augmentation Transformations for Risk-Sensitive Reinforcement Learning
Shuai MaJia Yuan Yu
2018-04-16
CytonRL: an Efficient Reinforcement Learning Open-source Toolkit Implemented in C++
Xiaolin Wang
2018-04-14
MOVI: A Model-Free Approach to Dynamic Fleet Management
Takuma OdaCarlee Joe-Wong
2018-04-13
Hierarchical Modular Reinforcement Learning Method and Knowledge Acquisition of State-Action Rule for Multi-target Problem
Takumi IchimuraDaisuke Igaue
2018-04-08
Reinforcement Learning based QoS/QoE-aware Service Function Chaining in Software-Driven 5G Slices
Xi ChenZonghang LiYupeng ZhangRuiming LongHongfang YuXiaojiang DuMohsen Guizani
2018-04-06
Joint Learning of Interactive Spoken Content Retrieval and Trainable User Simulator
Pei-Hung ChungKuan TungChing-Lun TaiHung-Yi Lee
2018-04-01
Learning Synergies between Pushing and Grasping with Self-supervised Deep Reinforcement Learning
Andy ZengShuran SongStefan WelkerJohnny LeeAlberto RodriguezThomas Funkhouser
2018-03-27
Natural Gradient Deep Q-learning
Ethan KnightOsher Lerner
2018-03-20
Composable Deep Reinforcement Learning for Robotic Manipulation
Tuomas HaarnojaVitchyr PongAurick ZhouMurtaza DalalPieter AbbeelSergey Levine
2018-03-19
A Machine Learning Approach for Power Allocation in HetNets Considering QoS
Roohollah AmiriHani MehrpouyanLex FridmanRanjan K. MallikArumugam NallanathanDavid Matolak
2018-03-18
Learning to Explore with Meta-Policy Gradient
Tianbing XuQiang LiuLiang ZhaoJian Peng
2018-03-13
Deep reinforcement learning for time series: playing idealized trading games
Xiang Gao
2018-03-11
Q-CP: Learning Action Values for Cooperative Planning
Francesco RiccioRoberto CapobiancoDaniele Nardi
2018-03-01
Variance Reduction Methods for Sublinear Reinforcement Learning
Sham KakadeMengdi WangLin F. Yang
2018-02-26
Weighted Double Deep Multiagent Reinforcement Learning in Stochastic Cooperative Environments
Yan ZhengJianye HaoZongzhang Zhang
2018-02-23
A Deep Q-Learning Agent for the L-Game with Variable Batch Training
Petros GiannakopoulosYannis Cotronis
2018-02-17
Monte Carlo Q-learning for General Game Playing
Hui WangMichael EmmerichAske Plaat
2018-02-16
Mean Field Multi-Agent Reinforcement Learning
Yaodong YangRui LuoMinne LiMing ZhouWeinan ZhangJun Wang
2018-02-15
Efficient Exploration through Bayesian Deep Q-Networks
Kamyar AzizzadenesheliAnimashree Anandkumar
2018-02-13
Q-learning with Nearest Neighbors
Devavrat ShahQiaomin Xie
2018-02-12
Balancing Two-Player Stochastic Games with Soft Q-Learning
Jordi Grau-MoyaFelix LeibfriedHaitham Bou-Ammar
2018-02-09
Deep Reinforcement Learning using Capsules in Advanced Game Environments
Per-Arne Andersen
2018-01-29
The QLBS Q-Learner Goes NuQLear: Fitted Q Iteration, Inverse RL, and Option Portfolios
Igor Halperin
2018-01-17
Deep Reinforcement Fuzzing
Konstantin BöttingerPatrice GodefroidRishabh Singh
2018-01-14
DeepTraffic: Crowdsourced Hyperparameter Tuning of Deep Reinforcement Learning Systems for Multi-Agent Dense Traffic Navigation
Lex FridmanJack TerwilligerBenedikt Jenik
2018-01-09
Faster Deep Q-learning using Neural Episodic Control
Daichi NishioSatoshi Yamane
2018-01-06
Avoiding Catastrophic States with Intrinsic Fear
Zachary C. LiptonKamyar AzizzadenesheliAbhishek KumarLihong LiJianfeng GaoLi Deng
2018-01-01
Representing Entropy : A short proof of the equivalence between soft Q-learning and policy gradients
Pierre H. RichemondBrendan Maginnis
2018-01-01
Autonomous Vehicle Fleet Coordination With Deep Reinforcement Learning
Cane Punma
2018-01-01
TD Learning with Constrained Gradients
Ishan DurugkarPeter Stone
2018-01-01
Parametrized Deep Q-Networks Learning: Playing Online Battle Arena with Discrete-Continuous Hybrid Action Space
Jiechao XiongQing WangZhuoran YangPeng SunYang ZhengLei HanHaobo FuXiangru LianCarson EisenachHaichuan YangEmmanuel EkwedikeBei PengHaoyue GaoTong ZhangJi LiuHan Liu
2018-01-01
Faster Reinforcement Learning with Expert State Sequences
Xiaoxiao GuoShiyu ChangMo YuMiao LiuGerald Tesauro
2018-01-01
A short variational proof of equivalence between policy gradients and soft Q learning
Pierre H. RichemondBrendan Maginnis
2017-12-22
A Deep Policy Inference Q-Network for Multi-Agent Systems
Zhang-Wei HongShih-Yang SuTzu-Yun ShannYi-Hsiang ChangChun-Yi Lee
2017-12-21
Improving Exploration in Evolution Strategies for Deep Reinforcement Learning via a Population of Novelty-Seeking Agents
Edoardo ContiVashisht MadhavanFelipe Petroski SuchJoel LehmanKenneth O. StanleyJeff Clune
2017-12-18
Deep Neuroevolution: Genetic Algorithms Are a Competitive Alternative for Training Deep Neural Networks for Reinforcement Learning
Felipe Petroski SuchVashisht MadhavanEdoardo ContiJoel LehmanKenneth O. StanleyJeff Clune
2017-12-18
Towards a Deep Reinforcement Learning Approach for Tower Line Wars
Per-Arne AndersenMorten GoodwinOle-Christoffer Granmo
2017-12-17
QLBS: Q-Learner in the Black-Scholes(-Merton) Worlds
Igor Halperin
2017-12-13
Assumed Density Filtering Q-learning
Heejin JeongClark ZhangGeorge J. PappasDaniel D. Lee
2017-12-09
Deep Primal-Dual Reinforcement Learning: Accelerating Actor-Critic using Bellman Duality
Woon Sang ChoMengdi Wang
2017-12-07
Zap Q-Learning
Adithya M DevrajSean Meyn
2017-12-01
Q-LDA: Uncovering Latent Patterns in Text-based Sequential Decision Processes
Jianshu ChenChong WangLin XiaoJi HeLihong LiLi Deng
2017-12-01
Uncertainty Estimates for Efficient Neural Network-based Dialogue Policy Optimisation
Christopher Tegho, Paweł Budzianowski, Milica Gašić
2017-11-30
A Benchmarking Environment for Reinforcement Learning Based Task Oriented Dialogue Management
Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Nikola Mrkšić, Tsung-Hsien Wen, Stefan Ultes, Lina Rojas-Barahona, Steve Young, Milica Gašić
2017-11-29
Implementing the Deep Q-Network
Melrose Roderick, James MacGlashan, Stefanie Tellex
2017-11-20
Neural Network Based Reinforcement Learning for Audio-Visual Gaze Control in Human-Robot Interaction
Stéphane Lathuilière, Benoit Massé, Pablo Mesejo, Radu Horaud
2017-11-18
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems
Zachary Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng
2017-11-15
A unified decision making framework for supply and demand management in microgrid networks
Diddigi Raghuram Bharadwaj, Sai Koti Reddy Danda, Krishnasuri Narayanam, Shalabh Bhatnagar
2017-11-14
Double Q($σ$) and Q($σ, λ$): Unifying Reinforcement Learning Control Algorithms
Markus Dumke
2017-11-05
TreeQN and ATreeC: Differentiable Tree-Structured Models for Deep Reinforcement Learning
Gregory Farquhar, Tim Rocktäschel, Maximilian Igl, Shimon Whiteson
2017-10-31
Distributional Reinforcement Learning with Quantile Regression
Will Dabney, Mark Rowland, Marc G. Bellemare, Rémi Munos
2017-10-27
The Effects of Memory Replay in Reinforcement Learning
Ruishan Liu, James Zou
2017-10-18
Deep Reinforcement Learning: Framework, Applications, and Embedded Implementations
Hongjia Li, Tianshu Wei, Ao Ren, Qi Zhu, Yanzhi Wang
2017-10-10
Rainbow: Combining Improvements in Deep Reinforcement Learning
Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver
2017-10-06
Supervised Q-walk for Learning Vector Representation of Nodes in Networks
Naimish Agarwal, G. C. Nandi
2017-10-03
A Simple Reinforcement Learning Mechanism for Resource Allocation in LTE-A Networks with Markov Decision Process and Q-Learning
Einar Cesar Santos
2017-09-27
An Optimal Online Method of Selecting Source Policies for Reinforcement Learning
Siyuan Li, Chongjie Zhang
2017-09-24
Improving Search through A3C Reinforcement Learning based Conversational Agent
Milan Aggarwal, Aarushi Arora, Shagun Sodhani, Balaji Krishnamurthy
2017-09-17
Deep Reinforcement Learning with Surrogate Agent-Environment Interface
Song Wang, Yu Jing
2017-09-12
Pre-training Neural Networks with Human Demonstrations for Deep Reinforcement Learning
Gabriel V. de la Cruz Jr, Yunshu Du, Matthew E. Taylor
2017-09-12
Formulation of Deep Reinforcement Learning Architecture Toward Autonomous Driving for On-Ramp Merge
Pin Wang, Ching-Yao Chan
2017-09-07
BIBI System Description: Building with CNNs and Breaking with Deep Reinforcement Learning
Yitong Li, Trevor Cohn, Timothy Baldwin
2017-09-01
Multi-Agent Q-Learning for Minimizing Demand-Supply Power Deficit in Microgrids
Raghuram Bharadwaj Diddigi, D. Sai Koti Reddy, Shalabh Bhatnagar
2017-08-25
Practical Block-wise Neural Network Architecture Generation
Zhao Zhong, Junjie Yan, Wei Wu, Jing Shao, Cheng-Lin Liu
2017-08-18
LADDER: A Human-Level Bidding Agent for Large-Scale Real-Time Online Auctions
Yu Wang, Jiayi Liu, Yuxiang Liu, Jun Hao, Yang He, Jinghe Hu, Weipeng P. Yan, Mantian Li
2017-08-18
Investigating Reinforcement Learning Agents for Continuous State Space Environments
David Von Dollen
2017-08-08
3DCNN-DQN-RNN: A Deep Reinforcement Learning Framework for Semantic Parsing of Large-scale 3D Point Clouds
Fangyu Liu, Shuaipeng Li, Liqiang Zhang, Chenghu Zhou, Rongtian Ye, Yuebin Wang, Jiwen Lu
2017-07-21
Empirical evaluation of a Q-Learning Algorithm for Model-free Autonomous Soaring
Erwan Lecarpentier, Sebastian Rapp, Marc Melo, Emmanuel Rachelson
2017-07-18
Fastest Convergence for Q-learning
Adithya M. Devraj, Sean P. Meyn
2017-07-12
Noisy Networks for Exploration
Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
2017-06-30
A Self-Adaptive Proposal Model for Temporal Action Detection based on Reinforcement Learning
Jingjia Huang, Nannan Li, Tao Zhang, Ge Li
2017-06-22
Generalized Value Iteration Networks: Life Beyond Lattices
Sufeng Niu, Siheng Chen, Hanyu Guo, Colin Targonski, Melissa C. Smith, Jelena Kovačević
2017-06-08
Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments
Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch
2017-06-07
Parameter Space Noise for Exploration
Matthias Plappert, Rein Houthooft, Prafulla Dhariwal, Szymon Sidor, Richard Y. Chen, Xi Chen, Tamim Asfour, Pieter Abbeel, Marcin Andrychowicz
2017-06-06
Explaining Transition Systems through Program Induction
Svetlin Penkov, Subramanian Ramamoorthy
2017-05-23
Shallow Updates for Deep Reinforcement Learning
Nir Levine, Tom Zahavy, Daniel J. Mankowitz, Aviv Tamar, Shie Mannor
2017-05-21
A Comparison of Reinforcement Learning Techniques for Fuzzy Cloud Auto-Scaling
Hamid Arabnejad, Claus Pahl, Pooyan Jamshidi, Giovani Estrada
2017-05-19
Learning to Represent Haptic Feedback for Partially-Observable Tasks
Jaeyong Sung, J. Kenneth Salisbury, Ashutosh Saxena
2017-05-17
Learning Hard Alignments with Variational Inference
Dieterich Lawson, Chung-Cheng Chiu, George Tucker, Colin Raffel, Kevin Swersky, Navdeep Jaitly
2017-05-16
Deep Episodic Value Iteration for Model-based Meta-Reinforcement Learning
Steven Stenberg Hansen
2017-05-09
Reinforcement Learning with External Knowledge and Two-Stage Q-functions for Predicting Popular Reddit Threads
Ji He, Mari Ostendorf, Xiaodong He
2017-04-20
The Reactor: A fast and sample-efficient Actor-Critic agent for Reinforcement Learning
Audrunas Gruslys, Will Dabney, Mohammad Gheshlaghi Azar, Bilal Piot, Marc Bellemare, Remi Munos
2017-04-15
Deep Q-learning from Demonstrations
Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys
2017-04-12
Data-efficient Deep Reinforcement Learning for Dexterous Manipulation
Ivaylo Popov, Nicolas Heess, Timothy Lillicrap, Roland Hafner, Gabriel Barth-Maron, Matej Vecerik, Thomas Lampe, Yuval Tassa, Tom Erez, Martin Riedmiller
2017-04-10
Pseudorehearsal in value function approximation
Vladimir Marochko, Leonard Johard, Manuel Mazzara
2017-03-21
Vision-based Robotic Arm Imitation by Human Gesture
Cheng Xuan, Zhiqiang Tang, Jinxin Xu
2017-03-15
Evolution Strategies as a Scalable Alternative to Reinforcement Learning
Tim Salimans, Jonathan Ho, Xi Chen, Szymon Sidor, Ilya Sutskever
2017-03-10
Tactics of Adversarial Attack on Deep Reinforcement Learning Agents
Yen-Chen Lin, Zhang-Wei Hong, Yuan-Hong Liao, Meng-Li Shih, Ming-Yu Liu, Min Sun
2017-03-08
Count-Based Exploration with Neural Density Models
Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos
2017-03-03
Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning
Jakob Foerster, Nantas Nardelli, Gregory Farquhar, Triantafyllos Afouras, Philip H. S. Torr, Pushmeet Kohli, Shimon Whiteson
2017-02-28
Bridging the Gap Between Value and Policy Based Reinforcement Learning
Ofir Nachum, Mohammad Norouzi, Kelvin Xu, Dale Schuurmans
2017-02-28
Learning Control for Air Hockey Striking using Deep Reinforcement Learning
Ayal Taitler, Nahum Shimkin
2017-02-26
Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning
Stefan Elfwing, Eiji Uchibe, Kenji Doya
2017-02-10
Autonomous Braking System via Deep Reinforcement Learning
Hyunmin Chae, Chang Mook Kang, ByeoungDo Kim, Jaekyum Kim, Chung Choo Chung, Jun Won Choi
2017-02-08
FPGA Architecture for Deep Learning and its application to Planetary Robotics
Pranay Gankidi, Jekan Thangavelautham
2017-01-26
Vulnerability of Deep Reinforcement Learning to Policy Induction Attacks
Vahid Behzadan, Arslan Munir
2017-01-16
Deep Reinforcement Learning for Multi-Domain Dialogue Systems
Heriberto Cuayáhuitl, Seunghak Yu, Ashley Williamson, Jacob Carse
2016-11-26
Memory Lens: How Much Memory Does an Agent Use?
Christoph Dann, Katja Hofmann, Sebastian Nowozin
2016-11-21
Averaged-DQN: Variance Reduction and Stabilization for Deep Reinforcement Learning
Oron Anschel, Nir Baram, Nahum Shimkin
2016-11-07
Learning to Play in a Day: Faster Deep Reinforcement Learning by Optimality Tightening
Frank S. He, Yang Liu, Alexander G. Schwing, Jian Peng
2016-11-05
Combating Reinforcement Learning's Sisyphean Curse with Intrinsic Fear
Zachary C. Lipton, Kamyar Azizzadenesheli, Abhishek Kumar, Lihong Li, Jianfeng Gao, Li Deng
2016-11-03
Using a Deep Reinforcement Learning Agent for Traffic Signal Control
Wade Genders, Saiedeh Razavi
2016-11-03
Internet of Things Applications: Animal Monitoring with Unmanned Aerial Vehicle
Jun Xu, Gurkan Solmaz, Rouhollah Rahmatizadeh, Damla Turgut, Ladislau Boloni
2016-10-17
Deep Reinforcement Learning From Raw Pixels in Doom
Danijar Hafner
2016-10-07
Q-Learning for Robust Satisfaction of Signal Temporal Logic Specifications
Derya Aksaray, Austin Jones, Zhaodan Kong, Mac Schwager, Calin Belta
2016-09-23
Opponent Modeling in Deep Reinforcement Learning
He He, Jordan Boyd-Graber, Kevin Kwok, Hal Daumé III
2016-09-18
Episodic Exploration for Deep Deterministic Policies: An Application to StarCraft Micromanagement Tasks
Nicolas Usunier, Gabriel Synnaeve, Zeming Lin, Soumith Chintala
2016-09-10
Multi Exit Configuration of Mesoscopic Pedestrian Simulation
Allan Lao, Kardi Teknomo
2016-09-06
Extending the OpenAI Gym for robotics: a toolkit for reinforcement learning using ROS and Gazebo
Iker Zamora, Nestor Gonzalez Lopez, Victor Mayoral Vilches, Alejandro Hernandez Cordero
2016-08-19
BBQ-Networks: Efficient Exploration in Deep Reinforcement Learning for Task-Oriented Dialogue Systems
Zachary C. Lipton, Xiujun Li, Jianfeng Gao, Lihong Li, Faisal Ahmed, Li Deng
2016-08-17
Deep Reinforcement Learning Discovers Internal Models
Nir Baram, Tom Zahavy, Shie Mannor
2016-06-16
Deep Reinforcement Learning With Macro-Actions
Ishan P. Durugkar, Clemens Rosenbaum, Stefan Dernbach, Sridhar Mahadevan
2016-06-15
ViZDoom: A Doom-based AI Research Platform for Visual Reinforcement Learning
Michał Kempka, Marek Wydmuch, Grzegorz Runc, Jakub Toczek, Wojciech Jaśkowski
2016-05-06
Classifying Options for Deep Reinforcement Learning
Kai Arulkumaran, Nat Dilokthanakul, Murray Shanahan, Anil Anthony Bharath
2016-04-27
Neurohex: A Deep Q-learning Hex Agent
Kenny Young, Ryan Hayward, Gautham Vasan
2016-04-24
Continuous Deep Q-Learning with Model-based Acceleration
Shixiang Gu, Timothy Lillicrap, Ilya Sutskever, Sergey Levine
2016-03-02
Reinforcement Learning approach for Real Time Strategy Games Battle city and S3
Harshit Sethy, Amit Patel
2016-02-16
Deep Exploration via Bootstrapped DQN
Ian Osband, Charles Blundell, Alexander Pritzel, Benjamin Van Roy
2016-02-15
Using Deep Q-Learning to Control Optimization Hyperparameters
Samantha Hansen
2016-02-12
Angrier Birds: Bayesian reinforcement learning
Imanol Arrieta Ibarra, Bernardo Ramos, Lars Roemheld
2016-01-06
How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies
Vincent François-Lavet, Raphael Fonteneau, Damien Ernst
2015-12-07
Deep Attention Recurrent Q-Network
Ivan Sorokin, Alexey Seleznev, Mikhail Pavlov, Aleksandr Fedorov, Anastasiia Ignateva
2015-12-05
State of the Art Control of Atari Games Using Shallow Reinforcement Learning
Yitao Liang, Marlos C. Machado, Erik Talvitie, Michael Bowling
2015-12-04
Deep Reinforcement Learning with Attention for Slate Markov Decision Processes with High-Dimensional States and Actions
Peter Sunehag, Richard Evans, Gabriel Dulac-Arnold, Yori Zwols, Daniel Visentin, Ben Coppin
2015-12-03
Multiagent Cooperation and Competition with Deep Reinforcement Learning
Ardi Tampuu, Tambet Matiisen, Dorian Kodelja, Ilya Kuzovkin, Kristjan Korjus, Juhan Aru, Jaan Aru, Raul Vicente
2015-11-27
Policy Distillation
Andrei A. Rusu, Sergio Gomez Colmenarejo, Caglar Gulcehre, Guillaume Desjardins, James Kirkpatrick, Razvan Pascanu, Volodymyr Mnih, Koray Kavukcuoglu, Raia Hadsell
2015-11-19
Prioritized Experience Replay
Tom Schaul, John Quan, Ioannis Antonoglou, David Silver
2015-11-18
Deep Reinforcement Learning with a Natural Language Action Space
Ji He, Jianshu Chen, Xiaodong He, Jianfeng Gao, Lihong Li, Li Deng, Mari Ostendorf
2015-11-14
A disembodied developmental robotic agent called Samu Bátfai
Norbert Bátfai
2015-11-09
Generating Text with Deep Reinforcement Learning
Hongyu Guo
2015-10-30
Deep Reinforcement Learning with Double Q-learning
Hado van Hasselt, Arthur Guez, David Silver
2015-09-22
Optimization of anemia treatment in hemodialysis patients via reinforcement learning
Pablo Escandell-Montero, Milena Chermisi, José M. Martínez-Martínez, Juan Gómez-Sanchis, Carlo Barbieri, Emilio Soria-Olivas, Flavio Mari, Joan Vila-Francés, Andrea Stopper, Emanuele Gatti, José D. Martín-Guerrero
2015-09-14
Continuous control with deep reinforcement learning
Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, Daan Wierstra
2015-09-09
Artificial Prediction Markets for Online Prediction of Continuous Variables-A Preliminary Report
Fatemeh Jahedpari, Marina De Vos, Sattar Hashemi, Benjamin Hirsch, Julian Padget
2015-08-11
Massively Parallel Methods for Deep Reinforcement Learning
Arun Nair, Praveen Srinivasan, Sam Blackwell, Cagdas Alcicek, Rory Fearon, Alessandro De Maria, Vedavyas Panneershelvam, Mustafa Suleyman, Charles Beattie, Stig Petersen, Shane Legg, Volodymyr Mnih, Koray Kavukcuoglu, David Silver
2015-07-15
Online Transfer Learning in Reinforcement Learning Domains
Yusen Zhan, Matthew E. Taylor
2015-07-02
Autonomous CRM Control via CLV Approximation with Deep Reinforcement Learning in Discrete and Continuous Action Space
Yegor Tkachenko
2015-04-08
Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning
Xiaoxiao Guo, Satinder Singh, Honglak Lee, Richard L. Lewis, Xiaoshi Wang
2014-12-01
Empirical Q-Value Iteration
Dileep Kalathil, Vivek S. Borkar, Rahul Jain
2014-11-30
Q-learning for Optimal Control of Continuous-time Systems
Biao Luo, Derong Liu, Tingwen Huang
2014-10-11
Reinforcement Learning Based Algorithm for the Maximization of EV Charging Station Revenue
Stoyan Dimitrov, Redouane Lguensat
2014-07-04
Personalized Medical Treatments Using Novel Reinforcement Learning Algorithms
Yousuf M. Soliman
2014-06-16
Two Timescale Convergent Q-learning for Sleep-Scheduling in Wireless Sensor Networks
Prashanth L. A., Abhranil Chatterjee, Shalabh Bhatnagar
2013-12-27
Playing Atari with Deep Reinforcement Learning
Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
2013-12-19
Q-learning optimization in a multi-agents system for image segmentation
Issam Qaffou, Mohamed Sadgal, Abdelaziz Elfazziki
2013-11-23
Risk-sensitive Reinforcement Learning
Yun Shen, Michael J. Tobia, Tobias Sommer, Klaus Obermayer
2013-11-08
Approximate Kalman Filter Q-Learning for Continuous State-Space MDPs
Charles Tripp, Ross D. Shachter
2013-09-26
Projective simulation for classical learning agents: a comprehensive investigation
Julian Mautner, Adi Makmal, Daniel Manzano, Markus Tiersch, Hans J. Briegel
2013-05-07
Speedy Q-Learning
Mohammad Ghavamzadeh, Hilbert J. Kappen, Mohammad G. Azar, Rémi Munos
2011-12-01
Double Q-learning
Hado V. Hasselt
2010-12-01
Convergent Temporal-Difference Learning with Arbitrary Smooth Function Approximation
Shalabh Bhatnagar, Doina Precup, David Silver, Richard S. Sutton, Hamid R. Maei, Csaba Szepesvári
2009-12-01

Categories