Uni-MIS: United Multiple Intent Spoken Language Understanding via Multi-View Intent-Slot Interaction

Proceedings of the AAAI Conference on Artificial Intelligence 2024 · Shangjian Yin, Peijie Huang, Yuhong Xu ·

So far, multi-intent spoken language understanding (SLU) has become a research hotspot in the field of natural language processing (NLP) due to its ability to recognize and extract multiple intents expressed and annotate corresponding sequence slot tags within a single utterance. Previous research has primarily concentrated on the token-level intent-slot interaction to model joint intent detection and slot filling, which resulted in a failure to fully utilize anisotropic intent-guiding information during joint training. In this work, we present a novel architecture by modeling the multi-intent SLU as a multi-view intent-slot interaction. The architecture resolves the kernel bottleneck of unified multi-intent SLU by effectively modeling the intent-slot relations with utterance, chunk, and token-level interaction. We further develop a neural framework, namely Uni-MIS, in which the unified multi-intent SLU is modeled as a three-view intent-slot interaction fusion to better capture the interaction information after special encoding. A chunk-level intent detection decoder is used to sufficiently capture the multi-intent, and an adaptive intent-slot graph network is used to capture the fine-grained intent information to guide final slot filling. We perform extensive experiments on two widely used benchmark datasets for multi-intent SLU, where our model bets on all the current strong baselines, pushing the state-of-the-art performance of unified multi-intent SLU. Additionally, the ChatGPT benchmark that we have developed demonstrates that there is a considerable amount of potential research value in the field of multi-intent SLU.

PDF Abstract