The WSJ0 Hipster Ambient Mixtures (WHAM!) dataset pairs each two-speaker mixture in the wsj0-2mix dataset with a unique noise background scene. It has an extension called WHAMR! that adds artificial reverberation to the speech signals in addition to the background noise.
81 PAPERS • 5 BENCHMARKS
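The pairing of a speech mixture with a noise recording can be illustrated with a small additive-mixing sketch. It is not the WHAM! generation script: the function name `add_noise_at_snr`, the target SNR value, and the toy sinusoid and white-noise signals are assumptions made for illustration, and only the Python standard library is used.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def add_noise_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise ratio equals `snr_db`, then add it.

    Hypothetical helper for illustration; returns (noisy mixture, scaled noise).
    """
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Gain that brings the noise RMS to speech_rms / 10^(snr_db / 20).
    gain = rms(speech) / (rms(noise) * 10 ** (snr_db / 20))
    scaled = [gain * n for n in noise]
    return [s + n for s, n in zip(speech, scaled)], scaled


# Toy stand-ins for real audio: two sinusoidal "speakers" and white noise.
rng = random.Random(0)
sr = 8000
spk1 = [math.sin(2 * math.pi * 220 * t / sr) for t in range(sr)]
spk2 = [0.5 * math.sin(2 * math.pi * 330 * t / sr) for t in range(sr)]
two_speaker_mix = [a + b for a, b in zip(spk1, spk2)]
noise = [rng.gauss(0.0, 1.0) for _ in range(sr)]

# snr_db=0.0 is an arbitrary example value, not a WHAM! setting.
noisy_mix, scaled_noise = add_noise_at_snr(two_speaker_mix, noise, snr_db=0.0)
achieved = 20 * math.log10(rms(two_speaker_mix) / rms(scaled_noise))
print(f"achieved SNR: {achieved:.2f} dB")
```

With real data, the lists would be replaced by waveforms loaded from the wsj0-2mix mixtures and the noise recordings.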
WHAMR! is a dataset for noisy and reverberant speech separation. It extends WHAM! by introducing synthetic reverberation to the speech sources in addition to the existing noise. Room impulse responses were generated and convolved using pyroomacoustics. Reverberation times were chosen to approximate domestic and classroom environments (expected to be similar to the restaurants and coffee shops where the WHAM! noise was collected), and further classified as high, medium, and low reverberation based on a qualitative assessment of the mixture’s noise recording.
45 PAPERS • 3 BENCHMARKS
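Adding reverberation to a speech source amounts to convolving the dry signal with a room impulse response (RIR). The sketch below illustrates that step under simplifying assumptions: instead of the room simulation in pyroomacoustics used for WHAMR!, it swaps in a decaying-noise RIR whose amplitude envelope drops by 60 dB after the chosen RT60. The helper names, the RT60 value, and the toy sample rate are hypothetical.

```python
import math
import random


def synthetic_rir(rt60, sr, rng):
    """Noise burst whose amplitude envelope falls by 60 dB after `rt60` seconds.

    Simplified stand-in for a simulated room impulse response.
    """
    n = round(rt60 * sr)
    # 10^(-3 * t / rt60) equals 1e-3 (that is, -60 dB) at t = rt60 seconds.
    return [rng.gauss(0.0, 1.0) * 10 ** (-3.0 * t / n) for t in range(n)]


def convolve(signal, kernel):
    """Direct-form linear convolution.

    The output has len(signal) + len(kernel) - 1 samples.
    """
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for j, k in enumerate(kernel):
            out[i + j] += s * k
    return out


rng = random.Random(0)
sr = 1000  # low toy sample rate so the direct convolution stays fast
dry = [math.sin(2 * math.pi * 50 * t / sr) for t in range(200)]

# rt60=0.3 is an arbitrary example value, not a WHAMR! setting.
rir = synthetic_rir(rt60=0.3, sr=sr, rng=rng)
wet = convolve(dry, rir)
print(len(dry), len(rir), len(wet))
```

The reverberant signal is longer than the dry one because the tail of the impulse response extends past the end of the source. For full-rate audio, an FFT-based convolution would be the more practical choice than the nested loops shown here.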
The Deep Noise Suppression (DNS) Challenge at INTERSPEECH 2020 was intended to promote collaborative research in single-channel speech enhancement, with the aim of maximizing the perceptual quality and intelligibility of the enhanced speech. The challenge evaluated speech quality using the online subjective evaluation framework ITU-T P.808 and provided large datasets for training noise suppressors.
42 PAPERS • 3 BENCHMARKS
A noiseless reverberant dataset built from the public WSJ0 corpus and room impulse responses simulated with the PyRoomAcoustics library. Used in:
- Speech Enhancement and Dereverberation with Diffusion-based Generative Models, Richter et al., arXiv 2022
- StoRM: A Stochastic Regeneration Model for Speech Enhancement and Dereverberation, Lemercier et al., arXiv 2022
- Analysing Discriminative versus Diffusion-based Generative Models for Speech Restoration, Lemercier et al., ICASSP 2023
2 PAPERS • NO BENCHMARKS YET
WHAMR_ext is an extension of the WHAMR! corpus with larger RT60 values (between 1 s and 3 s).
1 PAPER • 1 BENCHMARK