Adaptive Vague Preference Policy Learning for Multi-round Conversational Recommendation

Conversational recommendation systems (CRS) aim to elicit user preferences and provide satisfying recommendations through natural language interactions. Existing CRS methods often assume that users have clear and consistent preferences, which may not reflect the reality of user decision-making processes. In this paper, we introduce a novel scenario called Vague Preference Multi-round Conversational Recommendation (VPMCR), which considers users' vague and dynamic preferences in CRS. In the VPMCR setting, we propose a solution called Adaptive Vague Preference Policy Learning (AVPPL), which consists of two components: Ambiguity-aware Soft Estimation (ASE) and Dynamism-aware Policy Learning (DPL). ASE estimates the vagueness of user feedback and captures their dynamic preferences using a choice-based preference extraction module and a time-aware decaying strategy. DPL leverages the preference distribution estimated by ASE to guide the conversation and adapt to changes in user preferences using a graph-based conversation modeling module and a vague preference policy learning module. We conduct extensive experiments on four real-world datasets and demonstrate the effectiveness of our method in the VPMCR scenario, setting a new benchmark for future research in CRS.
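The abstract describes ASE only at a high level. As an illustration of the general idea rather than the paper's actual formulation, the sketch below shows one way a choice-based, time-decayed soft preference estimate could be maintained over candidate attributes; the class and parameter names (SoftPreferenceEstimator, decay, vague_weight) are hypothetical.

```python
from collections import defaultdict


class SoftPreferenceEstimator:
    """Toy soft-estimation sketch: keeps per-attribute preference scores,
    treating user choices as soft (possibly vague) evidence and decaying
    older turns so recent feedback dominates."""

    def __init__(self, decay: float = 0.8, vague_weight: float = 0.3):
        self.decay = decay                # time-aware decay applied each turn
        self.vague_weight = vague_weight  # partial credit for shown-but-not-chosen options
        self.scores = defaultdict(float)  # attribute -> soft preference score

    def update(self, clicked_attrs, shown_attrs):
        # Decay all existing scores to reflect dynamic, drifting preferences.
        for attr in self.scores:
            self.scores[attr] *= self.decay
        # Chosen attributes receive full evidence; shown-but-not-chosen ones
        # receive partial evidence, modelling vague rather than negative feedback.
        for attr in shown_attrs:
            weight = 1.0 if attr in clicked_attrs else self.vague_weight
            self.scores[attr] += weight

    def distribution(self):
        # Normalize the scores into a preference distribution for the policy.
        total = sum(self.scores.values()) or 1.0
        return {attr: score / total for attr, score in self.scores.items()}


# Example: two conversation turns with partially overlapping attribute options.
est = SoftPreferenceEstimator(decay=0.8, vague_weight=0.3)
est.update(clicked_attrs={"jazz"}, shown_attrs={"jazz", "rock", "pop"})
est.update(clicked_attrs={"live"}, shown_attrs={"live", "jazz"})
print(est.distribution())
```

In this toy version, the resulting distribution could serve the same role the abstract assigns to ASE's output: a soft preference signal that a downstream policy (here, DPL) consumes to decide what to ask or recommend next.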
