Search Results for author: Giovanni Morrone

Found 8 papers, 2 papers with code

An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings

no code implementations • 29 May 2023 • Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, Stefano Squartini

We found that, among all methods considered, EEND-vector clustering (EEND-VC) offers the best trade-off in terms of computing requirements and performance.

Clustering • speaker-diarization • +4

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

no code implementations • 21 Mar 2023 • Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

Finally, we also show that the separated signals can be readily used for automatic speech recognition, reaching performance close to using oracle sources in some configurations.

Action Detection • Activity Detection • +4

Conversational Speech Separation: an Evaluation Study for Streaming Applications

no code implementations • 31 May 2022 • Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini

Continuous speech separation (CSS) is a recently proposed framework which aims at separating each speaker from an input mixture signal in a streaming fashion.

Speech Separation

Low-Latency Speech Separation Guided Diarization for Telephone Conversations

1 code implementation • 5 Apr 2022 • Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

In particular, we compare two low-latency speech separation models.

Action Detection • Activity Detection • +5

Audio-Visual Speech Inpainting with Deep Learning

no code implementations • 9 Oct 2020 • Giovanni Morrone, Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen

In this paper, we present a deep-learning-based framework for audio-visual speech inpainting, i.e., the task of restoring the missing parts of an acoustic speech signal from reliable audio context and uncorrupted visual information.

Multi-Task Learning

Audio-Visual Target Speaker Enhancement on Multi-Talker Environment using Event-Driven Cameras

no code implementations • 5 Dec 2019 • Ander Arriandiaga, Giovanni Morrone, Luca Pasa, Leonardo Badino, Chiara Bartolozzi

To overcome this limitation, we propose the use of event-driven cameras and exploit their compression, high temporal resolution, and low latency for low-cost, low-latency motion feature extraction, moving towards online embedded audio-visual speech processing.

Optical Flow Estimation • Speech Separation

An Analysis of Speech Enhancement and Recognition Losses in Limited Resources Multi-talker Single Channel Audio-Visual ASR

no code implementations • 16 Apr 2019 • Luca Pasa, Giovanni Morrone, Leonardo Badino

In this paper, we analyzed how audio-visual speech enhancement can help to perform the ASR task in a cocktail party scenario.

Speech Enhancement

Face Landmark-based Speaker-Independent Audio-Visual Speech Enhancement in Multi-Talker Environments

1 code implementation • 6 Nov 2018 • Giovanni Morrone, Luca Pasa, Vadim Tikhanoff, Sonia Bergamaschi, Luciano Fadiga, Leonardo Badino

In this paper, we address the problem of enhancing the speech of a speaker of interest in a cocktail party scenario when visual information of the speaker of interest is available.

Ranked #1 on Speech Enhancement on GRID corpus (mixed-speech)

Speech Enhancement • Speech Separation
