no code implementations • 29 Oct 2022 • Ebbie Awino, Lilian Wanzare, Lawrence Muchemi, Barack Wanjawa, Edward Ombui, Florence Indede, Owen McOnyango, Benard Okal
Building automatic speech recognition (ASR) systems is a challenging task, especially for under-resourced languages, for which corpora must be constructed nearly from scratch and training data is scarce.
Automatic Speech Recognition (ASR)
no code implementations • 25 Aug 2022 • Barack Wanjawa, Lilian Wanzare, Florence Indede, Owen McOnyango, Edward Ombui, Lawrence Muchemi
The Kencorpus dataset is a text and speech corpus for three languages predominantly spoken in Kenya: Swahili, Dholuo and Luhya.
no code implementations • WS 2017 • Lilian Wanzare, Alessandra Zarcone, Stefan Thater, Manfred Pinkal
We present a semi-supervised clustering approach to induce script structure from crowdsourced descriptions of event sequences by grouping event descriptions into paraphrase sets (representing event types) and inducing their temporal order.