no code implementations • EMNLP 2021 • Yangyang Zhao, Zhenyu Wang, Changxi Zhu, Shihan Wang
Most existing dialogue policy methods rely on a single learning system, whereas the human brain has two specialized learning and memory systems that together support finding good solutions without requiring copious examples.
no code implementations • 5 May 2023 • Yangyang Zhao, Zhenyu Wang, Mehdi Dastani, Shihan Wang
When a conversation enters a dead-end state, it will continue along a dead-end trajectory regardless of the actions taken afterward, until the agent reaches a termination state or the maximum number of turns.
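As a rough illustration of the dead-end notion described above, the sketch below marks as dead ends all states of a toy dialogue graph from which no success state can be reached. The graph, the state names, and the function `find_dead_ends` are hypothetical and are not taken from the paper; it assumes a small, fully known transition graph, which a real dialogue environment would not provide.

```python
from collections import deque


def find_dead_ends(transitions, success_states):
    """Return the states from which no success state is reachable.

    transitions: dict mapping a state to the list of states reachable in one turn
                 (hypothetical toy representation, not the paper's formulation).
    """
    # Collect all states and build the reversed graph.
    states = set(transitions)
    reverse = {}
    for state, next_states in transitions.items():
        for nxt in next_states:
            states.add(nxt)
            reverse.setdefault(nxt, set()).add(state)

    # Breadth-first search backward from the success states.
    can_succeed = set(success_states)
    queue = deque(success_states)
    while queue:
        current = queue.popleft()
        for prev in reverse.get(current, ()):
            if prev not in can_succeed:
                can_succeed.add(prev)
                queue.append(prev)

    # Everything that cannot reach success is a dead end,
    # including failure terminal states.
    return states - can_succeed


# Toy dialogue graph (assumed for illustration only).
toy_dialogue = {
    "start": ["ask_slot", "wrong_inform"],
    "ask_slot": ["success", "wrong_inform"],
    "wrong_inform": ["stuck"],
    "stuck": ["fail"],
    "success": [],
    "fail": [],
}

dead_ends = find_dead_ends(toy_dialogue, ["success"])
```

In this toy graph, once the agent reaches `wrong_inform` every later action leads only to `stuck` and then `fail`, which matches the description of a trajectory that cannot recover before termination.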
no code implementations • 28 Dec 2020 • Yangyang Zhao, Zhenyu Wang, Zhenhua Huang
We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), which replaces the traditional random sampling method with a teacher policy model to enable automatic curriculum learning for the dialogue policy.
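To make the idea of replacing random sampling with a teacher more concrete, here is a minimal sketch in which a teacher chooses the next training goal from feedback instead of drawing it uniformly at random. It deliberately swaps in a simple epsilon-greedy bandit for the teacher; it is not the ACL-DQN teacher policy model, and the class `BanditTeacher`, the goal names, and the feedback signal are all assumptions made for illustration.

```python
import random


class BanditTeacher:
    """Epsilon-greedy goal selector (a stand-in, not the paper's teacher model)."""

    def __init__(self, goals, epsilon=0.1, lr=0.5, seed=0):
        self.goals = list(goals)
        self.epsilon = epsilon          # probability of picking a random goal
        self.lr = lr                    # step size for the value update
        self.value = {g: 0.0 for g in self.goals}
        self.rng = random.Random(seed)  # local RNG for reproducibility

    def choose_goal(self):
        # Explore with probability epsilon, otherwise pick the goal with the
        # highest estimated usefulness (ties go to the earliest goal).
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.goals)
        return max(self.goals, key=lambda g: self.value[g])

    def update(self, goal, teacher_reward):
        # Move the estimate for this goal toward the observed feedback,
        # e.g. a hypothetical measure of the student's learning progress.
        self.value[goal] += self.lr * (teacher_reward - self.value[goal])


# Usage with assumed goal labels and an assumed feedback value.
teacher = BanditTeacher(["easy", "medium", "hard"], epsilon=0.0)
teacher.update("medium", 1.0)
next_goal = teacher.choose_goal()
```

A uniform sampler would ignore the feedback entirely; the only point of the sketch is that goal selection becomes a learned decision driven by how training on each goal has gone so far.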