TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Voice Conversion	LibriSpeech test-clean	kNN-VC (prematched HiFiGAN)	Word Error Rate (WER)	7.36	# 1
Voice Conversion	LibriSpeech test-clean	kNN-VC (prematched HiFiGAN)	Equal Error Rate	37.15	# 1
Voice Conversion	LibriSpeech test-clean	kNN-VC (prematched HiFiGAN)	Character Error Rate (CER)	2.96	# 1

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/voice-conversion-with-just-nearest-neighbors/voice-conversion-on-librispeech-test-clean)](https://paperswithcode.com/sota/voice-conversion-on-librispeech-test-clean?p=voice-conversion-with-just-nearest-neighbors)`

Voice Conversion With Just Nearest Neighbors

30 May 2023 · Matthew Baas, Benjamin van Niekerk, Herman Kamper

Any-to-any voice conversion aims to transform source speech into a target voice with just a few examples of the target speaker as a reference. Recent methods produce convincing conversions, but at the cost of increased complexity -- making results difficult to reproduce and build on. Instead, we keep it simple. We propose k-nearest neighbors voice conversion (kNN-VC): a straightforward yet effective method for any-to-any conversion. First, we extract self-supervised representations of the source and reference speech. To convert to the target speaker, we replace each frame of the source representation with its nearest neighbor in the reference. Finally, a pretrained vocoder synthesizes audio from the converted representation. Objective and subjective evaluations show that kNN-VC improves speaker similarity with similar intelligibility scores to existing methods. Code, samples, trained models: https://bshall.github.io/knn-vc
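The matching step described in the abstract can be sketched in a few lines. This is a minimal illustration, not the authors' implementation: the function names `cosine_similarity` and `knn_convert`, the use of cosine similarity as the distance, and the option to average over `k` neighbors are all assumptions of this sketch. Feature extraction and vocoding are left out, and the frames here are toy 2-dimensional vectors standing in for self-supervised representations.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat an all-zero frame as matching nothing
    return dot / (norm_a * norm_b)


def knn_convert(source_frames, reference_frames, k=1):
    """Replace each source frame with the mean of its k most similar
    reference frames (k=1 is plain nearest-neighbor replacement).

    Hypothetical helper: the similarity measure and the averaging over
    k neighbors are assumptions of this sketch.
    """
    if not reference_frames:
        raise ValueError("reference_frames must not be empty")
    k = min(k, len(reference_frames))
    converted = []
    for frame in source_frames:
        # Rank every reference frame by similarity to this source frame.
        ranked = sorted(
            reference_frames,
            key=lambda ref: cosine_similarity(frame, ref),
            reverse=True,
        )
        neighbors = ranked[:k]
        # Average the selected neighbors dimension by dimension.
        converted.append(
            [sum(n[d] for n in neighbors) / k for d in range(len(frame))]
        )
    return converted


# Toy frames standing in for source and reference representations.
source = [[1.0, 0.1], [0.0, 1.0]]
reference = [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]]
converted = knn_convert(source, reference, k=1)
```

With `k=1` every output frame is copied from the reference set, so the converted sequence keeps the timing of the source but contains only frames from the target speaker. A practical version would run the search with a tensor library over batched features rather than Python lists.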

Code

bshall/knn-vc official

Tasks

Voice Conversion

Datasets

LibriSpeech

Results from the Paper

Ranked #1 on Voice Conversion on LibriSpeech test-clean (using extra training data)
