To Combine or Not To Combine? A Rainbow Deep Reinforcement Learning Agent for Dialog Policies

In this paper, we explore state-of-the-art deep reinforcement learning methods for dialog policy training such as prioritized experience replay, double deep Q-Networks, dueling network architectures and distributional learning. Our main findings show that each individual method improves the rewards and the task success rate but combining these methods in a Rainbow agent, which performs best across tasks and environments, is a non-trivial task...

Methods used in the Paper

METHOD               TYPE
Double Q-learning    Off-Policy TD Control
Dense Connections    Feedforward Networks
Convolution          Convolutions
Dueling Network      Q-Learning Networks
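
Two of the methods listed above, Double Q-learning and the dueling network, reduce to short formulas, and a small example may make them easier to compare. The following is a minimal sketch of the standard textbook forms, not the paper's implementation. The function names, argument names, and plain-list inputs are assumptions chosen for illustration; a real agent would compute these quantities from neural network outputs over dialog states.

```python
from typing import List, Sequence


def dueling_q_values(state_value: float, advantages: Sequence[float]) -> List[float]:
    """Dueling aggregation: Q(s, a) = V(s) + A(s, a) - mean over a' of A(s, a').

    Subtracting the mean advantage keeps V and A identifiable.
    """
    mean_advantage = sum(advantages) / len(advantages)
    return [state_value + adv - mean_advantage for adv in advantages]


def double_q_target(
    reward: float,
    gamma: float,
    done: bool,
    next_q_online: Sequence[float],
    next_q_target: Sequence[float],
) -> float:
    """Double Q-learning target for a single transition.

    The online network's Q-values select the next action; the target
    network's Q-values evaluate it. Decoupling selection from evaluation
    is what reduces the overestimation of plain Q-learning.
    """
    if done:
        # Terminal transition: no bootstrapped future value.
        return reward
    best_action = max(range(len(next_q_online)), key=lambda a: next_q_online[a])
    return reward + gamma * next_q_target[best_action]


# Hypothetical numbers, for illustration only.
q_values = dueling_q_values(1.0, [1.0, 2.0, 3.0])
target = double_q_target(1.0, 0.5, False, [0.1, 0.9], [4.0, 2.0])
```

In the usage lines, the online values pick action 1 even though the target values rate action 0 higher, so the target bootstraps from 2.0 rather than 4.0. A plain Q-learning target would instead take the maximum over the target values.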