no code implementations • 29 May 2023 • Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, Stefano Squartini
We found that, among all methods considered, EEND-vector clustering (EEND-VC) offers the best trade-off in terms of computing requirements and performance.
no code implementations • 21 Mar 2023 • Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini
Finally, we also show that the separated signals can be readily used for automatic speech recognition, reaching performance close to that obtained with oracle sources in some configurations.
no code implementations • 31 May 2022 • Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini
Continuous speech separation (CSS) is a recently proposed framework which aims at separating each speaker from an input mixture signal in a streaming fashion.
1 code implementation • 5 Apr 2022 • Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini
In particular, we compare two low-latency speech separation models.
no code implementations • 9 Oct 2020 • Giovanni Morrone, Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen
In this paper, we present a deep-learning-based framework for audio-visual speech inpainting, i.e., the task of restoring the missing parts of an acoustic speech signal from reliable audio context and uncorrupted visual information.
no code implementations • 5 Dec 2019 • Ander Arriandiaga, Giovanni Morrone, Luca Pasa, Leonardo Badino, Chiara Bartolozzi
To overcome this limitation, we propose the use of event-driven cameras and exploit their compression, high temporal resolution, and low latency for low-cost, low-latency motion feature extraction, moving towards online embedded audio-visual speech processing.
no code implementations • 16 Apr 2019 • Luca Pasa, Giovanni Morrone, Leonardo Badino
In this paper, we analyzed how audio-visual speech enhancement can help to perform the ASR task in a cocktail party scenario.
1 code implementation • 6 Nov 2018 • Giovanni Morrone, Luca Pasa, Vadim Tikhanoff, Sonia Bergamaschi, Luciano Fadiga, Leonardo Badino
In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available.
Ranked #1 on Speech Enhancement on GRID corpus (mixed-speech)