TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Utterance-level pronunciation scoring	speechocean762	HiPAMA-Librispeech	Pearson correlation coefficient (PCC)	0.754	# 2
Word-level pronunciation scoring	speechocean762	HiPAMA-Librispeech	Pearson correlation coefficient (PCC)	0.59	# 3
Phone-level pronunciation scoring	speechocean762	HiPAMA-Librispeech	Pearson correlation coefficient (PCC)	0.62	# 5
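
The metric in the table above is the Pearson correlation coefficient (PCC), which for pronunciation scoring is typically computed between model-predicted scores and human-annotated scores. The following is a minimal sketch of that computation using only the standard library. The function name `pearson_correlation` and the example scores are illustrative assumptions; they are not taken from the paper, the official code, or the speechocean762 data.

```python
import math

def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient between two equal-length score lists."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    n = len(predicted)
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    # Covariance and variances, left unnormalized because the 1/n factors cancel.
    cov = sum((p - mean_p) * (r - mean_r) for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("PCC is undefined when either sequence is constant")
    return cov / math.sqrt(var_p * var_r)

# Made-up scores for illustration only (hypothetical, not from speechocean762).
model_scores = [7.5, 6.0, 9.0, 4.5, 8.0]
human_scores = [8.0, 6.0, 9.0, 5.0, 7.0]
pcc = pearson_correlation(model_scores, human_scores)
print(f"PCC = {pcc:.3f}")
```

A PCC of 1 means the predicted scores are a perfect increasing linear function of the human scores, and 0 means there is no linear relationship.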

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hierarchical-pronunciation-assessment-with/utterance-level-pronounciation-scoring-on)](https://paperswithcode.com/sota/utterance-level-pronounciation-scoring-on?p=hierarchical-pronunciation-assessment-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hierarchical-pronunciation-assessment-with/word-level-pronunciation-scoring-on)](https://paperswithcode.com/sota/word-level-pronunciation-scoring-on?p=hierarchical-pronunciation-assessment-with)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/hierarchical-pronunciation-assessment-with/phone-level-pronunciation-scoring-on)](https://paperswithcode.com/sota/phone-level-pronunciation-scoring-on?p=hierarchical-pronunciation-assessment-with)`

Hierarchical Pronunciation Assessment with Multi-Aspect Attention

15 Nov 2022 · Heejin Do, Yunsu Kim, Gary Geunbae Lee

Automatic pronunciation assessment is a major component of a computer-assisted pronunciation training system. To provide in-depth feedback, scoring pronunciation at various levels of granularity such as phoneme, word, and utterance, with diverse aspects such as accuracy, fluency, and completeness, is essential. However, existing multi-aspect multi-granularity methods simultaneously predict all aspects at all granularity levels; therefore, they have difficulty in capturing the linguistic hierarchy of phoneme, word, and utterance. This limitation further leads to neglecting intimate cross-aspect relations at the same linguistic unit. In this paper, we propose a Hierarchical Pronunciation Assessment with Multi-aspect Attention (HiPAMA) model, which hierarchically represents the granularity levels to directly capture their linguistic structures and introduces multi-aspect attention that reflects associations across aspects at the same level to create more connotative representations. By obtaining relational information from both the granularity- and aspect-side, HiPAMA can take full advantage of multi-task learning. Remarkable improvements in the experimental results on the speechocean762 dataset demonstrate the robustness of HiPAMA, particularly in the difficult-to-assess aspects.
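
The two ideas named in the abstract, building word and utterance representations from the level below and letting aspects at the same level attend to one another, can be illustrated with a toy sketch. This is not the HiPAMA implementation: mean pooling and parameter-free scaled dot-product attention are stand-ins chosen for brevity, whereas the actual model uses learned layers whose details are not given on this page. All function names, feature values, and aspect vectors below are hypothetical.

```python
import math

def mean_pool(vectors):
    """Average a list of equal-length vectors element-wise."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def aspect_attention(aspects):
    """Mix aspect vectors at one granularity level.

    aspects: dict mapping aspect name -> vector (all the same length).
    Each output vector is a softmax-weighted average of all aspect vectors,
    with weights from scaled dot products (no learned parameters here).
    """
    names = list(aspects)
    dim = len(aspects[names[0]])
    mixed = {}
    for query in names:
        scores = [
            sum(a * b for a, b in zip(aspects[query], aspects[key])) / math.sqrt(dim)
            for key in names
        ]
        weights = softmax(scores)
        mixed[query] = [
            sum(w * aspects[key][i] for w, key in zip(weights, names))
            for i in range(dim)
        ]
    return mixed

# Toy utterance: phone-level feature vectors grouped by word (hypothetical values).
phones_by_word = [
    [[1.0, 0.0], [0.0, 1.0]],               # word 1: two phones
    [[1.0, 1.0], [3.0, 1.0], [2.0, 1.0]],   # word 2: three phones
]
word_vectors = [mean_pool(phones) for phones in phones_by_word]  # phone -> word
utterance_vector = mean_pool(word_vectors)                       # word -> utterance

# Hypothetical utterance-level aspect representations.
utterance_aspects = {
    "accuracy": [1.0, 0.0],
    "fluency": [0.0, 1.0],
    "completeness": [1.0, 1.0],
}
mixed_aspects = aspect_attention(utterance_aspects)
```

In this sketch each mixed aspect vector is a convex combination of all aspect vectors at that level, so information from, say, fluency can inform the accuracy representation before a score is predicted.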

Code

doheejin/HiPAMA (official)

Tasks

Multi-Task Learning

Phone-level pronunciation scoring

Utterance-level pronunciation scoring

Word-level pronunciation scoring

Datasets

speechocean762

Results from the Paper

Ranked #2 on Utterance-level pronunciation scoring on speechocean762


