no code implementations • LREC 2022 • Carol Figueroa, Adaeze Adigwe, Magalie Ochs, Gabriel Skantze
There has been a lot of work on predicting the timing of feedback in conversational systems.
no code implementations • SIGDIAL (ACL) 2021 • A. Seza Doğruöz, Gabriel Skantze
To clarify the boundaries of “openness”, we conduct two studies: First, we classify the types of “speech events” encountered in a chatbot evaluation data set (i.e., Meena by Google) and find that these conversations mainly cover the “small talk” category and exclude the other speech event categories encountered in real life human-human communication.
no code implementations • SIGDIAL (ACL) 2021 • Erik Ekstedt, Gabriel Skantze
The ability to take turns in a fluent way (i.e., without long response delays or frequent interruptions) is a fundamental aspect of any spoken dialog system.
no code implementations • 11 Mar 2024 • Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze
The results show that a monolingual VAP model trained on one language does not make good predictions when applied to other languages.
no code implementations • 10 Jan 2024 • Koji Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara, Gabriel Skantze
To address this issue, we propose a framework for indirectly but objectively evaluating systems based on users' behaviors.
1 code implementation • 10 Jan 2024 • Koji Inoue, Bing'er Jiang, Erik Ekstedt, Tatsuya Kawahara, Gabriel Skantze
A demonstration of a real-time and continuous turn-taking prediction system is presented.
1 code implementation • 23 Sep 2023 • Bram Willemsen, Livia Qian, Gabriel Skantze
Vision-language models (VLMs) have been shown to be effective at image retrieval based on simple text queries, but text-image retrieval based on conversational input remains a challenge.
1 code implementation • LREC 2022 • Bram Willemsen, Dmytro Kalpakchi, Gabriel Skantze
We address these concerns by introducing a collaborative image ranking task, a grounded agreement game we call "A Game Of Sorts".
no code implementations • 21 Aug 2023 • Koji Inoue, Divesh Lala, Keiko Ochi, Tatsuya Kawahara, Gabriel Skantze
This paper tackles the challenging task of evaluating socially situated conversational robots and presents a novel objective evaluation approach that relies on multimodal user behaviors.
1 code implementation • 14 Jul 2023 • Agnes Axelsson, Gabriel Skantze
In any system that uses structured knowledge graph (KG) data as its underlying knowledge representation, KG-to-text generation is a useful tool for turning parts of the graph data into text that can be understood by humans.
no code implementations • 29 May 2023 • Erik Ekstedt, Siyang Wang, Éva Székely, Joakim Gustafson, Gabriel Skantze
Turn-taking is a fundamental aspect of human communication where speakers convey their intention to either hold, or yield, their turn through prosodic cues.
no code implementations • 3 May 2023 • Bing'er Jiang, Erik Ekstedt, Gabriel Skantze
Treating the turn-prediction and response-ranking as a one-stage process, our findings suggest that our model can be used as an incremental response ranker, which can be applied in various settings.
no code implementations • 3 May 2023 • Bing'er Jiang, Erik Ekstedt, Gabriel Skantze
Filled pauses (or fillers), such as "uh" and "um", are frequent in spontaneous speech and can serve as a turn-holding cue for the listener, indicating that the current speaker is not done yet.
no code implementations • 21 Mar 2023 • Gabriel Skantze, A. Seza Doğruöz
There is a surge in interest in the development of open-domain chatbots, driven by the recent advancements of large language models.
no code implementations • 24 Nov 2022 • A. Seza Doğruöz, Gabriel Skantze
To clarify the boundaries of "openness", we conduct two studies: First, we classify the types of "speech events" encountered in a chatbot evaluation data set (i.e., Meena by Google) and find that these conversations mainly cover the "small talk" category and exclude the other speech event categories encountered in real life human-human communication.
2 code implementations • SIGDIAL (ACL) 2022 • Erik Ekstedt, Gabriel Skantze
Turn-taking is a fundamental aspect of human communication and can be described as the ability to take turns, project upcoming turn shifts, and supply backchannels at appropriate locations throughout a conversation.
3 code implementations • 19 May 2022 • Erik Ekstedt, Gabriel Skantze
The modeling of turn-taking in dialog can be viewed as the modeling of the dynamics of voice activity of the interlocutors.
1 code implementation • 15 Nov 2021 • Gabriel Skantze, Bram Willemsen
This paper presents CoLLIE: a simple, yet effective model for continual learning of how language is grounded in vision.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Erik Ekstedt, Gabriel Skantze
Syntactic and pragmatic completeness is known to be important for turn-taking prediction, but so far machine learning models of turn-taking have used such linguistic information in a limited way.
no code implementations • WS 2019 • Nils Axelsson, Gabriel Skantze
In dialogue, speakers continuously adapt their speech to accommodate the listener, based on the feedback they receive.
no code implementations • EMNLP 2018 • Todd Shore, Gabriel Skantze
Referring to entities in situated dialog is a collaborative process, whereby interlocutors often iteratively expand, repair and/or replace referring expressions, converging on conceptual pacts of referring language use in doing so.
1 code implementation • 31 Aug 2018 • Matthew Roddy, Gabriel Skantze, Naomi Harte
To design spoken dialog systems that can conduct fluid interactions it is desirable to incorporate cues from separate modalities into turn-taking models.
1 code implementation • 29 Jun 2018 • Matthew Roddy, Gabriel Skantze, Naomi Harte
The continuous predictions represent generalized turn-taking behaviors observed in the training data and can be applied to make decisions that are not just limited to end-of-turn detection.
no code implementations • WS 2017 • Gabriel Skantze
Previous models of turn-taking have mostly been trained for specific turn-taking decisions, such as discriminating between turn shifts and turn retention in pauses.