Search Results for author: Mirco Ravanelli

Found 58 papers, 41 papers with code

Listenable Maps for Audio Classifiers

no code implementations • 19 Mar 2024 • Francesco Paissan, Mirco Ravanelli, Cem Subakan

Despite the impressive performance of deep learning models across diverse tasks, their complexity poses challenges for interpretation.

Paper
Add Code

SKILL: Similarity-aware Knowledge distILLation for Speech Self-Supervised Learning

no code implementations • 26 Feb 2024 • Luca Zampierin, Ghouthi Boukli Hacene, Bac Nguyen, Mirco Ravanelli

Self-supervised learning (SSL) has achieved remarkable success across various speech-processing tasks.

Knowledge Distillation Self-Supervised Learning

Paper
Add Code

Focal Modulation Networks for Interpretable Sound Classification

no code implementations • 5 Feb 2024 • Luca Della Libera, Cem Subakan, Mirco Ravanelli

The increasing success of deep neural networks has raised concerns about their inherent black-box nature, posing challenges related to interpretability and trust.

Classification Environmental Sound Classification +1

Paper
Add Code

Bayesian Deep Learning for Remaining Useful Life Estimation via Stein Variational Gradient Descent

1 code implementation • 2 Feb 2024 • Luca Della Libera, Jacopo Andreoli, Davide Dalle Pezze, Mirco Ravanelli, Gian Antonio Susto

In particular, we show through experimental studies on simulated run-to-failure turbofan engine degradation data that Bayesian deep learning models trained via Stein variational gradient descent consistently outperform with respect to convergence speed and predictive performance both the same models trained via parametric variational inference and their frequentist counterparts trained via backpropagation.

Variational Inference

Paper
Code

Are LLMs Robust for Spoken Dialogues?

no code implementations • 4 Jan 2024 • Seyed Mahed Mousavi, Gabriel Roccabruna, Simone Alghisi, Massimo Rizzoli, Mirco Ravanelli, Giuseppe Riccardi

Large Pre-Trained Language Models have demonstrated state-of-the-art performance in different downstream tasks, including dialogue state tracking and end-to-end response generation.

Dialogue State Tracking Response Generation

Paper
Add Code

Parameter-Efficient Transfer Learning of Audio Spectrogram Transformers

1 code implementation • 6 Dec 2023 • Umberto Cappellazzo, Daniele Falavigna, Alessio Brutti, Mirco Ravanelli

The common modus operandi of fine-tuning large pre-trained Transformer models entails the adaptation of all their parameters (i. e., full fine-tuning).

Audio Classification Few-Shot Learning +1

Paper
Code

TorchAudio 2.1: Advancing speech recognition, self-supervised learning, and audio processing components for PyTorch

1 code implementation • 27 Oct 2023 • Jeff Hwang, Moto Hira, Caroline Chen, Xiaohui Zhang, Zhaoheng Ni, Guangzhi Sun, Pingchuan Ma, Ruizhe Huang, Vineel Pratap, Yuekai Zhang, Anurag Kumar, Chin-Yun Yu, Chuang Zhu, Chunxi Liu, Jacob Kahn, Mirco Ravanelli, Peng Sun, Shinji Watanabe, Yangyang Shi, Yumeng Tao, Robin Scheibler, Samuele Cornell, Sean Kim, Stavros Petridis

TorchAudio is an open-source audio and speech processing library built for PyTorch.

Self-Supervised Learning Speech Enhancement +2

2,379

Paper
Code

CL-MASR: A Continual Learning Benchmark for Multilingual ASR

1 code implementation • 25 Oct 2023 • Luca Della Libera, Pooneh Mousavi, Salah Zaiem, Cem Subakan, Mirco Ravanelli

To the best of our knowledge, CL-MASR is the first continual learning benchmark for the multilingual ASR task.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

Audio Editing with Non-Rigid Text Prompts

no code implementations • 19 Oct 2023 • Francesco Paissan, Zhepei Wang, Mirco Ravanelli, Paris Smaragdis, Cem Subakan

We show that the proposed editing pipeline is able to create audio edits that remain faithful to the input audio.

Audio Generation Style Transfer

Paper
Add Code

Towards Foundational Models for Molecular Learning on Large-Scale Multi-Task Datasets

1 code implementation • 6 Oct 2023 • Dominique Beaini, Shenyang Huang, Joao Alex Cunha, Zhiyi Li, Gabriela Moisescu-Pareja, Oleksandr Dymov, Samuel Maddrell-Mander, Callum McLean, Frederik Wenkel, Luis Müller, Jama Hussein Mohamud, Ali Parviz, Michael Craig, Michał Koziarski, Jiarui Lu, Zhaocheng Zhu, Cristian Gabellini, Kerstin Klaser, Josef Dean, Cas Wognum, Maciej Sypetkowski, Guillaume Rabusseau, Reihaneh Rabbany, Jian Tang, Christopher Morris, Ioannis Koutis, Mirco Ravanelli, Guy Wolf, Prudencio Tossou, Hadrien Mary, Therence Bois, Andrew Fitzgibbon, Błażej Banaszewski, Chad Martin, Dominic Masters

Recently, pre-trained foundation models have enabled significant advancements in multiple fields.

154

Paper
Code

Speech Self-Supervised Representations Benchmarking: a Case for Larger Probing Heads

no code implementations • 28 Aug 2023 • Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) leverages large datasets of unlabeled speech to reach impressive performance with reduced amounts of annotated data.

Benchmarking Self-Supervised Learning

Paper
Add Code

Generalization Limits of Graph Neural Networks in Identity Effects Learning

1 code implementation • 30 Jun 2023 • Giuseppe Alessio D'Inverno, Simone Brugiapaglia, Mirco Ravanelli

They are usually based on a message-passing mechanism and have gained increasing popularity for their intuitive formulation, which is closely linked to the Weisfeiler-Lehman (WL) test for graph isomorphism to which they have been proven equivalent in terms of expressive power.

Paper
Code

Speech Emotion Diarization: Which Emotion Appears When?

2 code implementations • 22 Jun 2023 • Yingzhi Wang, Mirco Ravanelli, Alya Yacoubi

Speech Emotion Recognition (SER) typically relies on utterance-level solutions.

speaker-diarization Speaker Diarization +1

7,856

Paper
Code

RescueSpeech: A German Corpus for Speech Recognition in Search and Rescue Domain

no code implementations • 6 Jun 2023 • Sangeet Sagar, Mirco Ravanelli, Bernd Kiefer, Ivana Kruijff Korbayova, Josef van Genabith

Despite the recent advancements in speech recognition, there are still difficulties in accurately transcribing conversational and emotional speech in noisy and reverberant acoustic environments.

Decision Making Robust Speech Recognition +1

Paper
Add Code

Speech Self-Supervised Representation Benchmarking: Are We Doing it Right?

1 code implementation • 1 Jun 2023 • Salah Zaiem, Youcef Kemiche, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has recently allowed leveraging large datasets of unlabeled speech signals to reach impressive performance on speech tasks using only small amounts of annotated data.

Benchmarking Self-Supervised Learning

Paper
Code

Simulated Annealing in Early Layers Leads to Better Generalization

1 code implementation • CVPR 2023 • AmirMohammad Sarfi, Zahra Karimpour, Muawiz Chaudhary, Nasir M. Khalid, Mirco Ravanelli, Sudhir Mudur, Eugene Belilovsky

Our principal innovation in this work is to use Simulated annealing in EArly Layers (SEAL) of the network in place of re-initialization of later layers.

Few-Shot Learning Transfer Learning

Paper
Code

Posthoc Interpretation via Quantization

1 code implementation • 22 Mar 2023 • Francesco Paissan, Cem Subakan, Mirco Ravanelli

In this paper, we introduce a new approach, called Posthoc Interpretation via Quantization (PIQ), for interpreting decisions made by trained classifiers.

Image Segmentation Quantization +1

7,856

Paper
Code

Fine-tuning Strategies for Faster Inference using Speech Self-Supervised Models: A Comparative Study

1 code implementation • 12 Mar 2023 • Salah Zaiem, Robin Algayres, Titouan Parcollet, Slim Essid, Mirco Ravanelli

Self-supervised learning (SSL) has allowed substantial progress in Automatic Speech Recognition (ASR) performance in low-resource settings.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Code

SoundChoice: Grapheme-to-Phoneme Models with Semantic Disambiguation

1 code implementation • 27 Jul 2022 • Artem Ploujnikov, Mirco Ravanelli

End-to-end speech synthesis models directly convert the input characters into an audio representation (e. g., spectrograms).

Language Modelling Multi-Task Learning +5

7,856

Paper
Code

Resource-Efficient Separation Transformer

1 code implementation • 19 Jun 2022 • Luca Della Libera, Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin

Transformers have recently achieved state-of-the-art performance in speech separation.

Speech Separation

7,856

Paper
Code

Learning Representations for New Sound Classes With Continual Self-Supervised Learning

1 code implementation • 15 May 2022 • Zhepei Wang, Cem Subakan, Xilin Jiang, Junkai Wu, Efthymios Tzinis, Mirco Ravanelli, Paris Smaragdis

In this paper, we work on a sound recognition system that continually incorporates new sound classes.

Continual Learning Representation Learning +1

Paper
Code

Exploring Self-Attention Mechanisms for Speech Separation

1 code implementation • 6 Feb 2022 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Francois Grondin, Mirko Bronzi

In particular, we extend our previous findings on the SepFormer by providing results on more challenging noisy and noisy-reverberant datasets, such as LibriMix, WHAM!, and WHAMR!.

Ranked #1 on Speech Enhancement on WHAM!

Denoising Speech Enhancement +1

7,856

Paper
Code

OSSEM: one-shot speaker adaptive speech enhancement using meta learning

no code implementations • 10 Nov 2021 • Cheng Yu, Szu-Wei Fu, Tsun-An Hsieh, Yu Tsao, Mirco Ravanelli

Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers.

Meta-Learning Speech Enhancement

Paper
Add Code

REAL-M: Towards Speech Separation on Real Mixtures

1 code implementation • 20 Oct 2021 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, François Grondin

First, we release the REAL-M dataset, a crowd-sourced corpus of real-life mixtures.

Open-Ended Question Answering Speech Separation

7,857

Paper
Code

MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

2 code implementations • 12 Oct 2021 • Szu-Wei Fu, Cheng Yu, Kuo-Hsuan Hung, Mirco Ravanelli, Yu Tsao

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training.

Speech Enhancement

7,856

Paper
Code

Interpretable SincNet-based Deep Learning for Emotion Recognition from EEG brain activity

1 code implementation • 18 Jul 2021 • Juan Manuel Mayor-Torres, Mirco Ravanelli, Sara E. Medina-DeVilliers, Matthew D. Lerner, Giuseppe Riccardi

This result is consistent with recent neuroscience studies on emotion recognition, which found an association between these band suppressions and the behavioral deficits observed in individuals with ASD.

EEG Emotion Recognition

Paper
Code

SpeechBrain: A General-Purpose Speech Toolkit

4 code implementations • 8 Jun 2021 • Mirco Ravanelli, Titouan Parcollet, Peter Plantinga, Aku Rouhe, Samuele Cornell, Loren Lugosch, Cem Subakan, Nauman Dawalatabad, Abdelwahab Heba, Jianyuan Zhong, Ju-chieh Chou, Sung-Lin Yeh, Szu-Wei Fu, Chien-Feng Liao, Elena Rastorgueva, François Grondin, William Aris, Hwidong Na, Yan Gao, Renato de Mori, Yoshua Bengio

SpeechBrain is an open-source and all-in-one speech toolkit.

Language Identification Spoken Language Understanding

7,856

Paper
Code

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement

3 code implementations • 8 Apr 2021 • Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao

The discrepancy between the cost function used for training a speech enhancement model and human auditory perception usually makes the quality of enhanced speech unsatisfactory.

Ranked #12 on Speech Enhancement on VoiceBank + DEMAND

Speech Enhancement

7,856

Paper
Code

Timers and Such: A Practical Benchmark for Spoken Language Understanding with Numbers

2 code implementations • 4 Apr 2021 • Loren Lugosch, Piyush Papreja, Mirco Ravanelli, Abdelwahab Heba, Titouan Parcollet

This paper introduces Timers and Such, a new open source dataset of spoken English commands for common voice control use cases involving numbers.

Ranked #4 on Spoken Language Understanding on Timers and Such (using extra training data)

Spoken Language Understanding

7,856

Paper
Code

ECAPA-TDNN Embeddings for Speaker Diarization

no code implementations • 3 Apr 2021 • Nauman Dawalatabad, Mirco Ravanelli, François Grondin, Jenthe Thienpondt, Brecht Desplanques, Hwidong Na

Learning robust speaker embeddings is a crucial step in speaker diarization.

speaker-diarization Speaker Diarization +1

Paper
Add Code

Transformers with Competitive Ensembles of Independent Mechanisms

no code implementations • 27 Feb 2021 • Alex Lamb, Di He, Anirudh Goyal, Guolin Ke, Chien-Feng Liao, Mirco Ravanelli, Yoshua Bengio

In this work we explore a way in which the Transformer architecture is deficient: it represents each position with a large monolithic hidden representation and a single set of parameters which are applied over the entire hidden representation.

Speech Enhancement

Paper
Add Code

Attention is All You Need in Speech Separation

4 code implementations • 25 Oct 2020 • Cem Subakan, Mirco Ravanelli, Samuele Cornell, Mirko Bronzi, Jianyuan Zhong

Transformers are emerging as a natural alternative to standard RNNs, replacing recurrent computations with a multi-head attention mechanism.

Ranked #7 on Speech Separation on WSJ0-3mix

Speech Separation

7,856

Paper
Code

BIRD: Big Impulse Response Dataset

1 code implementation • 19 Oct 2020 • François Grondin, Jean-Samuel Lauzon, Simon Michaud, Mirco Ravanelli, François Michaud

This paper introduces BIRD, the Big Impulse Response Dataset.

Sound Audio and Speech Processing

131

Paper
Code

Quaternion Neural Networks for Multi-channel Distant Speech Recognition

1 code implementation • 18 May 2020 • Xinchi Qiu, Titouan Parcollet, Mirco Ravanelli, Nicholas Lane, Mohamed Morchid

In this paper, we propose to capture these inter- and intra- structural dependencies with quaternion neural networks, which can jointly process multiple signals as whole quaternion entities.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

2,351

Paper
Code

Multi-task self-supervised learning for Robust Speech Recognition

1 code implementation • 25 Jan 2020 • Mirco Ravanelli, Jianyuan Zhong, Santiago Pascual, Pawel Swietojanski, Joao Monteiro, Jan Trmal, Yoshua Bengio

We then propose a revised encoder that better learns short- and long-term speech dynamics with an efficient combination of recurrent and convolutional networks.

Robust Speech Recognition Self-Supervised Learning +1

434

Paper
Code

Using Speech Synthesis to Train End-to-End Spoken Language Understanding Models

2 code implementations • 21 Oct 2019 • Loren Lugosch, Brett Meyer, Derek Nowrouzezahrai, Mirco Ravanelli

End-to-end models are an attractive new approach to spoken language understanding (SLU) in which the meaning of an utterance is inferred directly from the raw audio without employing the standard pipeline composed of a separately trained speech recognizer and natural language understanding module.

Ranked #7 on Spoken Language Understanding on Snips-SmartLights

Data Augmentation Natural Language Understanding +2

220

Paper
Code

Retrieving Signals in the Frequency Domain with Deep Complex Extractors

1 code implementation • 25 Sep 2019 • Chiheb Trabelsi, Olexa Bilaniuk, Ousmane Dia, Ying Zhang, Mirco Ravanelli, Jonathan Binas, Negar Rostamzadeh, Christopher J Pal

Using the Wall Street Journal Dataset, we compare our phase-aware loss to several others that operate both in the time and frequency domains and demonstrate the effectiveness of our proposed signal extraction method and proposed loss.

Audio Source Separation

Paper
Code

Retrieving Signals with Deep Complex Extractors

no code implementations • NeurIPS Workshop Deep_Invers 2019 • Chiheb Trabelsi, Olexa Bilaniuk, Ousmane Dia, Ying Zhang, Mirco Ravanelli, Jonathan Binas, Negar Rostamzadeh, Christopher J Pal

Building on recent advances, we propose a new deep complex-valued method for signal retrieval and extraction in the frequency domain.

Audio Source Separation Retrieval

Paper
Add Code

Speech Model Pre-training for End-to-End Spoken Language Understanding

1 code implementation • 7 Apr 2019 • Loren Lugosch, Mirco Ravanelli, Patrick Ignoto, Vikrant Singh Tomar, Yoshua Bengio

Whereas conventional spoken language understanding (SLU) systems map speech to text, and then text to intent, end-to-end SLU systems map speech directly to intent through a single trainable model.

Ranked #15 on Spoken Language Understanding on Fluent Speech Commands (using extra training data)

Spoken Language Understanding

220

Paper
Code

Learning Problem-agnostic Speech Representations from Multiple Self-supervised Tasks

1 code implementation • 6 Apr 2019 • Santiago Pascual, Mirco Ravanelli, Joan Serrà, Antonio Bonafonte, Yoshua Bengio

Learning good representations without supervision is still an open issue in machine learning, and is particularly challenging for speech signals, which are often characterized by long sequences with a complex hierarchical structure.

Ranked #2 on Distant Speech Recognition on DIRHA English WSJ

Distant Speech Recognition

434

Paper
Code

Speech and Speaker Recognition from Raw Waveform with SincNet

2 code implementations • 13 Dec 2018 • Mirco Ravanelli, Yoshua Bengio

Deep neural networks can learn complex and abstract representations, that are progressively obtained by combining simpler ones.

Inductive Bias Speaker Recognition +2

1,097

Paper
Code

Learning Speaker Representations with Mutual Information

2 code implementations • 1 Dec 2018 • Mirco Ravanelli, Yoshua Bengio

Mutual Information (MI) or similar measures of statistical dependence are promising tools for learning these representations in an unsupervised way.

Sentence Speaker Identification

Paper
Code

Interpretable Convolutional Filters with SincNet

1 code implementation • 23 Nov 2018 • Mirco Ravanelli, Yoshua Bengio

Deep learning is currently playing a crucial role toward higher levels of artificial intelligence.

Ranked #3 on Distant Speech Recognition on DIRHA English WSJ

Distant Speech Recognition Inductive Bias +1

2,351

Paper
Code

Speech recognition with quaternion neural networks

no code implementations • 21 Nov 2018 • Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Renato de Mori

Neural network architectures are at the core of powerful automatic speech recognition systems (ASR).

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Add Code

The PyTorch-Kaldi Speech Recognition Toolkit

11 code implementations • 19 Nov 2018 • Mirco Ravanelli, Titouan Parcollet, Yoshua Bengio

Experiments, that are conducted on several datasets and tasks, show that PyTorch-Kaldi can effectively be used to develop modern state-of-the-art speech recognizers.

Ranked #1 on Distant Speech Recognition on DIRHA English WSJ

Distant Speech Recognition Noisy Speech Recognition

2,351

Paper
Code

Speaker Recognition from Raw Waveform with SincNet

26 code implementations • 29 Jul 2018 • Mirco Ravanelli, Yoshua Bengio

Rather than employing standard hand-crafted features, the latter CNNs learn low-level speech representations from waveforms, potentially allowing the network to better capture important narrow-band speaker characteristics such as pitch and formants.

Speaker Identification Speaker Recognition +1

1,097

Paper
Code

Quaternion Recurrent Neural Networks

3 code implementations • ICLR 2019 • Titouan Parcollet, Mirco Ravanelli, Mohamed Morchid, Georges Linarès, Chiheb Trabelsi, Renato de Mori, Yoshua Bengio

Recurrent neural networks (RNNs) are powerful architectures to model sequential data, due to their capability to learn short and long-term dependencies between the basic elements of a sequence.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

2,351

Paper
Code

Automatic context window composition for distant speech recognition

no code implementations • 26 May 2018 • Mirco Ravanelli, Maurizio Omologo

Distant speech recognition is being revolutionized by deep learning, that has contributed to significantly outperform previous HMM-GMM systems.

Distant Speech Recognition speech-recognition

Paper
Add Code

Twin Regularization for online speech recognition

2 code implementations • 15 Apr 2018 • Mirco Ravanelli, Dmitriy Serdyuk, Yoshua Bengio

Online speech recognition is crucial for developing natural human-machine interfaces.

speech-recognition Speech Recognition

2,351

Paper
Code

Light Gated Recurrent Units for Speech Recognition

1 code implementation • 26 Mar 2018 • Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

A field that has directly benefited from the recent advances in deep learning is Automatic Speech Recognition (ASR).

Ranked #6 on Speech Recognition on TIMIT

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Code

Deep Learning for Distant Speech Recognition

no code implementations • 17 Dec 2017 • Mirco Ravanelli

Deep learning is an emerging technology that is considered one of the most promising directions for reaching higher levels of artificial intelligence.

Distant Speech Recognition speech-recognition

Paper
Add Code

Realistic multi-microphone data simulation for distant speech recognition

1 code implementation • 26 Nov 2017 • Mirco Ravanelli, Piergiorgio Svaizer, Maurizio Omologo

The availability of realistic simulated corpora is of key importance for the future progress of distant speech recognition technology.

Audio and Speech Processing Sound

Paper
Code

Contaminated speech training methods for robust DNN-HMM distant speech recognition

1 code implementation • 10 Oct 2017 • Mirco Ravanelli, Maurizio Omologo

Despite the significant progress made in the last years, state-of-the-art speech recognition technologies provide a satisfactory performance only in the close-talking condition.

Distant Speech Recognition Speech Enhancement +1

Paper
Code

The DIRHA-English corpus and related tasks for distant-speech recognition in domestic environments

2 code implementations • 6 Oct 2017 • Mirco Ravanelli, Maurizio Omologo

This paper introduces the contents and the possible usage of the DIRHA-ENGLISH multi-microphone corpus, recently realized under the EC DIRHA project.

Distant Speech Recognition speech-recognition

Paper
Code

Improving speech recognition by revising gated recurrent units

1 code implementation • 29 Sep 2017 • Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

First, we suggest to remove the reset gate in the GRU design, resulting in a more efficient single-gate architecture.

speech-recognition Speech Recognition

Paper
Code

Batch-normalized joint training for DNN-based distant speech recognition

no code implementations • 24 Mar 2017 • Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

Improving distant speech recognition is a crucial step towards flexible human-machine interfaces.

Distant Speech Recognition Speech Enhancement +1

Paper
Add Code

A network of deep neural networks for distant speech recognition

no code implementations • 23 Mar 2017 • Mirco Ravanelli, Philemon Brakel, Maurizio Omologo, Yoshua Bengio

Despite the remarkable progress recently made in distant speech recognition, state-of-the-art technology still suffers from a lack of robustness, especially when adverse acoustic conditions characterized by non-stationary noises and reverberation are met.

Distant Speech Recognition Speech Enhancement +1

Paper
Add Code

The DIRHA simulated corpus

no code implementations • LREC 2014 • Luca Cristoforetti, Mirco Ravanelli, Maurizio Omologo, Aless Sosi, ro, Alberto Abad, Martin Hagmueller, Petros Maragos

This paper describes a multi-microphone multi-language acoustic corpus being developed under the EC project Distant-speech Interaction for Robust Home Applications (DIRHA).

Dialogue Management Distant Speech Recognition +2

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.