Search Results for author: Zhengxu Hou

Found 1 paper, 1 paper with code

Imperfect also Deserves Reward: Multi-Level and Sequential Reward Modeling for Better Dialog Management

1 code implementation • NAACL 2021 • Zhengxu Hou, Bang Liu, Ruihui Zhao, Zijing Ou, Yafei Liu, Xi Chen, Yefeng Zheng

For task-oriented dialog systems, training a Reinforcement Learning (RL) based Dialog Management module suffers from low sample efficiency and slow convergence speed due to the sparse rewards in RL. To solve this problem, many strategies have been proposed to give proper rewards when training RL, but their rewards lack interpretability and cannot accurately estimate the distribution of state-action pairs in real dialogs.

