Adaptive Vague Preference Policy Learning for Multi-round Conversational Recommendation

Conversational recommendation systems (CRS) aim to elicit user preferences and provide satisfying recommendations through natural language interactions. Existing CRS methods often assume that users have clear and consistent preferences, which may not reflect the reality of user decision-making processes. In this paper, we introduce a novel scenario called Vague Preference Multi-round Conversational Recommendation (VPMCR), which considers users' vague and dynamic preferences in CRS. In the VPMCR setting, we propose a solution called Adaptive Vague Preference Policy Learning (AVPPL), which consists of two components: Ambiguity-aware Soft Estimation (ASE) and Dynamism-aware Policy Learning (DPL). ASE estimates the vagueness of user feedback and captures their dynamic preferences using a choice-based preference extraction module and a time-aware decaying strategy. DPL leverages the preference distribution estimated by ASE to guide the conversation and adapt to changes in user preferences using a graph-based conversation modeling module and a vague preference policy learning module. We conduct extensive experiments on four real-world datasets and demonstrate the effectiveness of our method in the VPMCR scenario, setting a new benchmark for future research in CRS.
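The abstract describes ASE only at a high level. As an illustration of the general idea rather than the paper's actual formulation, the sketch below shows one way a choice-based, time-decayed soft preference estimate could be maintained over candidate attributes; the class and parameter names (SoftPreferenceEstimator, decay, vague_weight) are hypothetical.

```python
from collections import defaultdict


class SoftPreferenceEstimator:
    """Toy soft-estimation sketch: keeps per-attribute preference scores,
    treating user choices as soft (possibly vague) evidence and decaying
    older turns so recent feedback dominates."""

    def __init__(self, decay: float = 0.8, vague_weight: float = 0.3):
        self.decay = decay                # time-aware decay applied each turn
        self.vague_weight = vague_weight  # partial credit for shown-but-not-chosen options
        self.scores = defaultdict(float)  # attribute -> soft preference score

    def update(self, clicked_attrs, shown_attrs):
        # Decay all existing scores to reflect dynamic, drifting preferences.
        for attr in self.scores:
            self.scores[attr] *= self.decay
        # Chosen attributes receive full evidence; shown-but-not-chosen ones
        # receive partial evidence, modelling vague rather than negative feedback.
        for attr in shown_attrs:
            weight = 1.0 if attr in clicked_attrs else self.vague_weight
            self.scores[attr] += weight

    def distribution(self):
        # Normalize the scores into a preference distribution for the policy.
        total = sum(self.scores.values()) or 1.0
        return {attr: score / total for attr, score in self.scores.items()}


# Example: two conversation turns with partially overlapping attribute options.
est = SoftPreferenceEstimator(decay=0.8, vague_weight=0.3)
est.update(clicked_attrs={"jazz"}, shown_attrs={"jazz", "rock", "pop"})
est.update(clicked_attrs={"live"}, shown_attrs={"live", "jazz"})
print(est.distribution())
```

In this toy version, the resulting distribution could serve the same role the abstract assigns to ASE's output: a soft preference signal that a downstream policy (here, DPL) consumes to decide what to ask or recommend next.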
