Search Results for author: Erik Ekstedt

Found 9 papers, 4 papers with code

Projection of Turn Completion in Incremental Spoken Dialogue Systems

no code implementations SIGDIAL (ACL) 2021 Erik Ekstedt, Gabriel Skantze

The ability to take turns in a fluent way (i.e., without long response delays or frequent interruptions) is a fundamental aspect of any spoken dialog system.

Language Modelling, Speech Recognition, +2 more

Multilingual Turn-taking Prediction Using Voice Activity Projection

no code implementations 11 Mar 2024 Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze

The results show that a monolingual VAP model trained on one language does not make good predictions when applied to other languages.

Real-time and Continuous Turn-taking Prediction Using Voice Activity Projection

1 code implementation 10 Jan 2024 Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze

A demonstration of a real-time and continuous turn-taking prediction system is presented.

Automatic Evaluation of Turn-taking Cues in Conversational Speech Synthesis

no code implementations 29 May 2023 Erik Ekstedt, Siyang Wang, Éva Székely, Joakim Gustafson, Gabriel Skantze

Turn-taking is a fundamental aspect of human communication where speakers convey their intention to either hold or yield their turn through prosodic cues.

Speech Synthesis

Response-conditioned Turn-taking Prediction

no code implementations 3 May 2023 Bing'er Jiang, Erik Ekstedt, Gabriel Skantze

Treating the turn-prediction and response-ranking as a one-stage process, our findings suggest that our model can be used as an incremental response ranker, which can be applied in various settings.

Response Generation

What makes a good pause? Investigating the turn-holding effects of fillers

no code implementations 3 May 2023 Bing'er Jiang, Erik Ekstedt, Gabriel Skantze

Filled pauses (or fillers), such as "uh" and "um", are frequent in spontaneous speech and can serve as a turn-holding cue for the listener, indicating that the current speaker is not done yet.

Position

How Much Does Prosody Help Turn-taking? Investigations using Voice Activity Projection Models

2 code implementations SIGDIAL (ACL) 2022 Erik Ekstedt, Gabriel Skantze

Turn-taking is a fundamental aspect of human communication and can be described as the ability to take turns, project upcoming turn shifts, and supply backchannels at appropriate locations throughout a conversation.

Voice Activity Projection: Self-supervised Learning of Turn-taking Events

3 code implementations 19 May 2022 Erik Ekstedt, Gabriel Skantze

The modeling of turn-taking in dialog can be viewed as the modeling of the dynamics of voice activity of the interlocutors.

Self-Supervised Learning

TurnGPT: a Transformer-based Language Model for Predicting Turn-taking in Spoken Dialog

1 code implementation Findings of the Association for Computational Linguistics 2020 Erik Ekstedt, Gabriel Skantze

Syntactic and pragmatic completeness is known to be important for turn-taking prediction, but so far machine learning models of turn-taking have used such linguistic information in a limited way.

Language Modelling
