1 code implementation • LREC 2020 • Jón Friðrik Daðason, David Erik Mollberg, Hrafn Loftsson, Kristín Bjarnadóttir
In this paper, we present a character-based BiLSTM model for splitting Icelandic compound words, and show how varying amounts of training data affect the performance of the model.
no code implementations • WS (NoDaLiDa) 2019 • Svanhvít Lilja Ingólfsdóttir, Hrafn Loftsson, Jón Friðrik Daðason, Kristín Bjarnadóttir
Lemmatization, finding the basic morphological form of a word in a corpus, is an important step in many natural language processing tasks when working with morphologically rich languages.