Sample Efficient Deep Reinforcement Learning for Dialogue Systems with Large Action Spaces

In spoken dialogue systems, we aim to deploy artificial intelligence to build automated dialogue agents that can converse with humans. A part of this effort is the policy optimisation task, which attempts to find a policy describing how to respond to humans, in the form of a function taking the current state of the dialogue and returning the response of the system...

Methods used in the Paper


METHOD                       TYPE
Experience Replay            Replay Memory
Retrace                      Value Function Estimation
TRPO                         Policy Gradient Methods
Entropy Regularization       Regularization
Stochastic Dueling Network   Value Function Estimation
Dense Connections            Feedforward Networks
ReLU                         Activation Functions
Softmax                      Output Functions
Convolution                  Convolutions
ACER                         Policy Gradient Methods