Spoken Dialogue Systems
19 papers with code • 0 benchmarks • 2 datasets
Latest papers with no code
Acoustic Modeling for End-to-End Empathetic Dialogue Speech Synthesis Using Linguistic and Prosodic Contexts of Dialogue History
To train the empathetic dialogue speech synthesis (DSS) model effectively, we investigate 1) a self-supervised learning model pretrained on large speech corpora, 2) style-guided training using a prosody embedding of the current utterance to be predicted by the dialogue context embedding, 3) cross-modal attention to combine text and speech modalities, and 4) a sentence-wise embedding to achieve fine-grained prosody modeling rather than utterance-wise modeling.
Understanding How People Rate Their Conversations
In this work, we conduct a study to better understand how people rate their interactions with conversational agents.
Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems
In this paper, we present Duplex Conversation, a multi-turn, multimodal spoken dialogue system that enables telephone-based agents to interact with customers like a human.
NLU for Game-based Learning in Real: Initial Evaluations
Intelligent systems designed for play-based interactions should be contextually aware of the users and their surroundings.
Data Augmentation with Paraphrase Generation and Entity Extraction for Multimodal Dialogue System
Contextually aware intelligent agents are often required to understand the users and their surroundings in real-time.
Gated Multimodal Fusion with Contrastive Learning for Turn-taking Prediction in Human-robot Dialogue
Turn-taking, aiming to decide when the next speaker can start talking, is an essential component in building human-robot spoken dialogue systems.
Dialogue Strategy Adaptation to New Action Sets Using Multi-dimensional Modelling
A major bottleneck for building statistical spoken dialogue systems for new domains and applications is the need for large amounts of training data.
A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals
In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset.
EVI: Multilingual Spoken Dialogue Tasks and Dataset for Knowledge-Based Enrolment, Verification, and Identification
Knowledge-based authentication is crucial for task-oriented spoken dialogue systems that offer personalised and privacy-focused services.
TOD-DA: Towards Boosting the Robustness of Task-oriented Dialogue Modeling on Spoken Conversations
Task-oriented dialogue systems have been plagued by the difficulties of obtaining large-scale and high-quality annotated conversations.