Search Results for author: Yangyang Zhao

Found 3 papers, 0 papers with code

Efficient Dialogue Complementary Policy Learning via Deep Q-network Policy and Episodic Memory Policy

no code implementations EMNLP 2021 Yangyang Zhao, Zhenyu Wang, Changxi Zhu, Shihan Wang

Most existing dialogue policy methods rely on a single learning system, whereas the human brain has two specialized learning and memory systems that together support finding good solutions without requiring copious examples.

Rescue Conversations from Dead-ends: Efficient Exploration for Task-oriented Dialogue Policy Optimization

no code implementations 5 May 2023 Yangyang Zhao, Zhenyu Wang, Mehdi Dastani, Shihan Wang

When a conversation enters a dead-end state, regardless of the actions taken afterward, it will continue along a dead-end trajectory until the agent reaches a termination state or the maximum number of turns.

Data Augmentation, Efficient Exploration

Automatic Curriculum Learning With Over-repetition Penalty for Dialogue Policy Learning

no code implementations 28 Dec 2020 Yangyang Zhao, Zhenyu Wang, Zhenhua Huang

We propose a novel framework, Automatic Curriculum Learning-based Deep Q-Network (ACL-DQN), which replaces the traditional random sampling method with a teacher policy model to realize the dialogue policy for automatic curriculum learning.
