1 code implementation • 28 Jan 2024 • Desh Raj, Matthew Wiesner, Matthew Maciejewski, Leibny Paola Garcia-Perera, Daniel Povey, Sanjeev Khudanpur
The Streaming Unmixing and Recognition Transducer (SURT) has recently become a popular framework for continuous, streaming, multi-talker automatic speech recognition (ASR).
1 code implementation • 3 Nov 2020 • Desh Raj, Leibny Paola Garcia-Perera, Zili Huang, Shinji Watanabe, Daniel Povey, Andreas Stolcke, Sanjeev Khudanpur
Several advances have been made recently towards handling overlapping speech for speaker diarization.
Audio and Speech Processing • Sound
no code implementations • 13 Jul 2020 • Carlos Rodrigo Castillo-Sanchez, Leibny Paola Garcia-Perera, Anabel Martin-Gonzalez
Using these models to identify the instances in which these speakers intervene in a recording is the task of speaker tracking.
1 code implementation • 23 Oct 2019 • Marvin Lavechin, Marie-Philippe Gill, Ruben Bousbib, Hervé Bredin, Leibny Paola Garcia-Perera
In the in-domain scenario where the training and test sets cover the exact same domains, we show that the domain-adversarial approach does not degrade performance of the proposed end-to-end model.
Audio and Speech Processing • I.2.7
no code implementations • 6 Nov 2018 • Matthew Maciejewski, Gregory Sell, Leibny Paola Garcia-Perera, Shinji Watanabe, Sanjeev Khudanpur
To date, the bulk of research on single-channel speech separation has been conducted using clean, near-field, read speech, which is not representative of many modern applications.