Helsinki Prosody Corpus

Introduced by Talman et al. in Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations

The Helsinki Prosody Corpus is a dataset for predicting prosodic prominence from written text. It contains automatically generated, high-quality prosodic annotations for the 'clean' subsets of the LibriTTS corpus (Zen et al., 2019), comprising 262.5 hours of read speech from 1230 speakers. The transcribed sentences were aligned with the audio and then annotated with word-level acoustic prominence labels.

Source: Predicting Prosodic Prominence from Text with Pre-trained Contextualized Word Representations
