Speech Enhancement

218 papers with code • 12 benchmarks • 19 datasets

Speech Enhancement is a signal processing task that improves the quality of speech signals captured under noisy or degraded conditions. The goal is to make speech clearer, more intelligible, and more pleasant to listen to, for applications such as speech recognition, teleconferencing, and hearing aids.
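
One common way to put a number on "noisy conditions" is the signal-to-noise ratio (SNR) in decibels. The sketch below is illustrative only: the function name `snr_db` and the toy signals are assumptions, not taken from any of the papers listed here, and published evaluations usually report perceptual and intelligibility metrics as well.

```python
import math


def snr_db(clean, noisy):
    """Return the SNR in dB of `noisy` relative to the reference `clean`.

    Both arguments are equal-length sequences of samples. The noise is
    taken to be the sample-wise difference noisy - clean.
    """
    if len(clean) != len(noisy):
        raise ValueError("clean and noisy must have the same length")
    signal_power = sum(s * s for s in clean)
    noise_power = sum((n - s) ** 2 for s, n in zip(clean, noisy))
    if noise_power == 0:
        return math.inf   # identical signals: no noise at all
    if signal_power == 0:
        return -math.inf  # silent reference: nothing but noise
    return 10.0 * math.log10(signal_power / noise_power)


# Toy data (hypothetical): a 440 Hz tone sampled at 16 kHz for 0.1 s,
# corrupted by an alternating +/-0.1 offset standing in for noise.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noisy = [s + 0.1 * (-1) ** t for t, s in enumerate(clean)]

print(f"input SNR: {snr_db(clean, noisy):.2f} dB")
```

An enhancement system would be expected to return a signal whose SNR against the clean reference is higher than that of its noisy input.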

Latest papers with no code

Mel-FullSubNet: Mel-Spectrogram Enhancement for Improving Both Speech Quality and ASR

no code yet • 21 Feb 2024

In this work, we propose Mel-FullSubNet, a single-channel Mel-spectrogram denoising and dereverberation network for improving both speech quality and automatic speech recognition (ASR) performance.
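
For readers unfamiliar with the Mel scale behind a Mel-spectrogram, the sketch below uses the widely used HTK-style conversion formula. It is not taken from the Mel-FullSubNet paper: the function names, the 80-band layout, and the 0 to 8000 Hz range are assumptions chosen for illustration.

```python
import math


def hz_to_mel(hz):
    """Convert a frequency in Hz to mels (HTK-style formula)."""
    return 2595.0 * math.log10(1.0 + hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_bands, f_min, f_max):
    """Center frequencies (Hz) of n_bands filters spaced evenly in mels."""
    m_min, m_max = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (m_max - m_min) / (n_bands + 1)
    return [mel_to_hz(m_min + step * (i + 1)) for i in range(n_bands)]


# Assumed layout: 80 bands covering 0 to 8000 Hz.
centers = mel_center_frequencies(80, 0.0, 8000.0)
print(f"first band: {centers[0]:.1f} Hz, last band: {centers[-1]:.1f} Hz")
```

Bands spaced evenly in mels are narrow at low frequencies and wide at high frequencies, which is why a Mel-spectrogram has far fewer bins than the linear-frequency spectrogram it is derived from.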

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network

no code yet • 20 Feb 2024

In this study, we present a novel weighting prediction approach, which explicitly learns the task relationships from downstream training information to address the core challenge of universal speech enhancement.

SECP: A Speech Enhancement-Based Curation Pipeline For Scalable Acquisition Of Clean Speech

no code yet • 19 Feb 2024

In this paper, we address this issue by outlining the Speech Enhancement-based Curation Pipeline (SECP), which serves as a framework to onboard clean speech.

Speaking in Wavelet Domain: A Simple and Efficient Approach to Speed up Speech Diffusion Model

no code yet • 16 Feb 2024

Recently, Denoising Diffusion Probabilistic Models (DDPMs) have attained leading performance across a diverse range of generative tasks.

Diffusion Models for Audio Restoration

no code yet • 15 Feb 2024

Here, we aim to show that diffusion models can combine the best of both worlds and offer the opportunity to design audio restoration algorithms with a good degree of interpretability and a remarkable performance in terms of sound quality.

Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality

no code yet • 14 Feb 2024

The primary goal of the L3DAS23 Signal Processing Grand Challenge at ICASSP 2023 is to promote and support collaborative research on machine learning for 3D audio signal processing, with a specific emphasis on 3D speech enhancement and 3D Sound Event Localization and Detection in Extended Reality applications.

Unrestricted Global Phase Bias-Aware Single-channel Speech Enhancement with Conformer-based Metric GAN

no code yet • 13 Feb 2024

With the rapid development of neural networks in recent years, networks for single-channel speech enhancement have become remarkably effective at enhancing the magnitude spectrum of noisy speech.

Objective and subjective evaluation of speech enhancement methods in the UDASE task of the 7th CHiME challenge

no code yet • 2 Feb 2024

In this paper, we present the objective and subjective evaluations of the systems that were submitted to the CHiME-7 UDASE task, and we provide an analysis of the results.

An Analysis of the Variance of Diffusion-based Speech Enhancement

no code yet • 1 Feb 2024

The speech enhancement performance varies depending on the choice of the stochastic differential equation that controls the evolution of the mean and the variance along the diffusion processes when adding environmental and Gaussian noise.

Real-time Stereo Speech Enhancement with Spatial-Cue Preservation based on Dual-Path Structure

no code yet • 1 Feb 2024

The algorithm runs in real time on 10-ms frames with 40 ms of look-ahead.
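
These timing figures translate into buffer sizes once a sampling rate is fixed. The sketch below assumes a 16 kHz sampling rate, which the excerpt does not state, and treats algorithmic latency as frame length plus look-ahead, a common convention that the paper may or may not follow. All names are hypothetical.

```python
def ms_to_samples(duration_ms, sample_rate_hz):
    """Number of samples spanned by duration_ms at sample_rate_hz."""
    return duration_ms * sample_rate_hz // 1000


SAMPLE_RATE_HZ = 16_000  # assumption: not given in the excerpt
FRAME_MS = 10            # frame length from the excerpt
LOOKAHEAD_MS = 40        # look-ahead from the excerpt

frame_samples = ms_to_samples(FRAME_MS, SAMPLE_RATE_HZ)
lookahead_samples = ms_to_samples(LOOKAHEAD_MS, SAMPLE_RATE_HZ)

# Assumed convention: a frame cannot be emitted until it has been fully
# received and the look-ahead that follows it has arrived as well.
algorithmic_latency_ms = FRAME_MS + LOOKAHEAD_MS

print(frame_samples, lookahead_samples, algorithmic_latency_ms)
```

Under these assumptions each frame holds 160 samples, the look-ahead spans 640 samples, and the algorithmic latency is 50 ms before any compute time is added.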