Speech Dereverberation
16 papers with code • 4 benchmarks • 5 datasets
Removing reverberation from audio signals
Latest papers
RVAE-EM: Generative speech dereverberation based on recurrent variational auto-encoder and convolutive transfer function
In this work, we propose a generative dereverberation method.
StoRM: A Diffusion-based Stochastic Regeneration Model for Speech Enhancement and Dereverberation
As diffusion models are generative approaches, they may also produce vocalizing and breathing artifacts in adverse conditions.
DeFT-AN: Dense Frequency-Time Attentive Network for Multichannel Speech Enhancement
In this study, we propose a dense frequency-time attentive network (DeFT-AN) for multichannel speech enhancement.
Analysing Diffusion-based Generative Approaches versus Discriminative Approaches for Speech Restoration
In this paper, we systematically compare the performance of generative diffusion models and discriminative approaches on different speech restoration tasks.
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation
In this work, deformable convolution is proposed as a way to give TCN models dynamic receptive fields (RFs) that can adapt to varying reverberation times in reverberant speech separation.
Speech Dereverberation with a Reverberation Time Shortening Target
The proposed RTS target suppresses late reverberation while preserving its exponential decay property, which eases network training and thus reduces the signal distortion caused by prediction error.
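The RTS idea can be sketched as applying an additional exponential decay to a room impulse response (RIR) so its reverberation time is shortened while the decay stays exponential. This is a minimal illustration, not the paper's implementation; the function name, the 5 ms direct-path window, and the parameterization are assumptions.

```python
import numpy as np

def rts_target_rir(rir, sr, t60_orig, t60_target, direct_ms=5.0):
    """Sketch of a reverberation-time-shortening (RTS) target RIR.

    Multiplies the RIR by an extra exponential decay so the effective T60
    drops from t60_orig to t60_target, keeping the decay exponential.
    Illustrative only; not the paper's exact formulation.
    """
    # Amplitude envelope ~ exp(-6.91 * t / T60): -60 dB energy at t = T60,
    # since ln(10**3) ≈ 6.91 for the amplitude (10**-3 amplitude = -60 dB power).
    t = np.arange(len(rir)) / sr
    extra_decay = np.exp(-6.91 * t * (1.0 / t60_target - 1.0 / t60_orig))
    out = rir * extra_decay
    # Leave the direct path (first few milliseconds) untouched.
    n_direct = int(direct_ms * 1e-3 * sr)
    out[:n_direct] = rir[:n_direct]
    return out
```

Convolving clean speech with such a target RIR yields a training target that is less reverberant than the input but still decays smoothly, which is the property the blurb credits with easing network training.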
Speech Enhancement and Dereverberation with Diffusion-based Generative Models
This matches our forward process, which moves from clean speech to noisy speech by including a drift term.
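A forward process that drifts from clean toward noisy speech can be sketched as an exponential interpolation of the process mean: the drift pulls the state from the clean signal x toward the noisy signal y over diffusion time. This is a hedged sketch of that idea only; the function name and the stiffness parameter `gamma` are illustrative, not the paper's exact SDE.

```python
import numpy as np

def forward_mean(x_clean, y_noisy, t, gamma=1.5):
    """Mean of a drift-based forward diffusion process (sketch).

    At t = 0 the mean is the clean speech; as t grows, the drift term
    pulls it exponentially toward the noisy speech y. `gamma` controls
    how fast that transition happens (illustrative value).
    """
    w = np.exp(-gamma * t)
    return w * x_clean + (1.0 - w) * y_noisy
```

Reverse-time sampling then starts near the noisy signal and, guided by a learned score model, moves back toward clean speech; the sketch above only covers the forward mean.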
MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes
We show that the acoustic metrics of the IRs predicted from our MESH2IR match the ground truth with less than 10% error.
Utterance Weighted Multi-Dilation Temporal Convolutional Networks for Monaural Speech Dereverberation
It is shown that this weighted multi-dilation temporal convolutional network (WD-TCN) consistently outperforms the TCN across various model configurations. Using the WD-TCN is also a more parameter-efficient way to improve performance than increasing the number of convolutional blocks.
Receptive Field Analysis of Temporal Convolutional Networks for Monaural Speech Dereverberation
A feature of TCNs is that their receptive field (RF) depends on the specific model configuration, which determines how many input frames can be observed to produce an individual output frame.