RSS-TOBI - A Prosodically Enhanced Romanian Speech Corpus

LREC 2014 · Tiberiu Boroș, Adriana Stan, Oliver Watts, Stefan Daniel Dumitrescu

This paper introduces a recent development of a Romanian speech corpus: the addition of prosodic annotations of the speech data in the form of ToBI labels. We describe the methodology for determining the pitch patterns that are common in Romanian, annotate the speech resource, and then compare two text-to-speech synthesis systems to establish the benefits of adding this type of information to our speech resource. The result is a publicly available speech dataset that can be used to further develop speech synthesis systems or to automatically learn to predict ToBI labels from Romanian text.
