TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Speech Enhancement	VoiceBank + DEMAND	FSPEN	PESQ	2.97	# 17
Speech Enhancement	VoiceBank + DEMAND	FSPEN	STOI	0.942	# 10
Speech Enhancement	VoiceBank + DEMAND	FSPEN	Para. (M)	0.079	# 1

FSPEN: AN ULTRA-LIGHTWEIGHT NETWORK FOR REAL TIME SPEECH ENHANCEMENT

Conference 2024 · Lei Yang, Wei Liu, Ruijie Meng, Gunwoo Lee, Soonho Baek, Han-gil Moon

Deep learning-based speech enhancement methods have shown promising results in recent years. However, in practical applications, model size and computational complexity are important factors that limit their use in end products. Therefore, products that require real-time speech enhancement with limited resources, such as TWS headsets, hearing aids, and IoT devices, need ultra-lightweight models. In this paper, an ultra-lightweight network, FSPEN, is proposed for the real-time speech enhancement task. We propose a full-band and sub-band network structure for extracting global and local features, and an inter-frame path extension method that enhances the network's modeling capacity while preserving complexity. Experiments demonstrate that the proposed FSPEN achieves a PESQ of 2.97 on the VoiceBank+DEMAND dataset with 89M multiply-accumulate operations per second (MACs) and 79k parameters.
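As a rough aid to reading these figures, the sketch below converts the reported parameter count into the unit used in the results table (millions) and shows how a MACs-per-second figure maps to a per-frame compute budget. The 16 kHz sample rate and 256-sample hop are illustrative assumptions, not values stated in the abstract, and the helper names are hypothetical.

```python
def params_in_millions(n_params: int) -> float:
    """Convert a raw parameter count to millions, the unit of the 'Para. (M)' metric."""
    return n_params / 1e6


def macs_per_frame(macs_per_second: float, sample_rate_hz: int, hop_samples: int) -> float:
    """Per-frame MAC budget for a streaming model that processes one frame per hop."""
    frames_per_second = sample_rate_hz / hop_samples
    return macs_per_second / frames_per_second


# Figures reported in the abstract.
REPORTED_PARAMS = 79_000
REPORTED_MACS_PER_SECOND = 89e6

# Assumed framing, for illustration only (not taken from the abstract).
ASSUMED_SAMPLE_RATE_HZ = 16_000
ASSUMED_HOP_SAMPLES = 256

if __name__ == "__main__":
    print(f"Parameters: {params_in_millions(REPORTED_PARAMS):.3f} M")
    budget = macs_per_frame(
        REPORTED_MACS_PER_SECOND, ASSUMED_SAMPLE_RATE_HZ, ASSUMED_HOP_SAMPLES
    )
    print(f"MACs per frame under the assumed framing: {budget:,.0f}")
```

The first conversion shows that 79k parameters and the table's 0.079 M are the same quantity. The per-frame figure depends entirely on the assumed hop size, so it should be read as an order-of-magnitude illustration only.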
