Processing Language Resources of Under-Resourced and Endangered Languages for the Generation of Augmentative Alternative Communication Boards

LREC 2020  ·  Anne Ferger ·

Under-resourced and endangered or small languages yield problems for automatic processing and exploiting because of the small amount of available data. This paper shows an approach using different annotations of enriched linguistic research data to create communication boards commonly used in Alternative Augmentative Communication (AAC). Using manually created lexical analysis and rich annotation (instead of high data quantity) allows for an automated creation of AAC communication boards. The example presented in this paper uses data of the indigenous language Dolgan (an endangered Turkic language of Northern Siberia) created in the project INEL(Arkhipov and D{\"a}britz, 2018) to generate a basic communication board with audio snippets to be used in e.g. hospital communication or for multilingual settings. The created boards can be importet into various AAC software. In addition, the usage of standard formats makes this approach applicable to various different use cases.

PDF Abstract
No code implementations yet. Submit your code now

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here