no code implementations • RANLP (BUCC) 2021 • Ben Burtenshaw, Mike Kestemont
Multi-label toxicity detection is highly prominent, with many research groups, companies, and individuals engaging with it through shared tasks and dedicated venues.
no code implementations • EMNLP (LaTeCHCLfL, CLFL, LaTeCH) 2021 • Enrique Manjavacas Arevalo, Laurence Mellerin, Mike Kestemont
We report on an inter-annotator agreement experiment involving instances of text reuse focusing on the well-known case of biblical intertextuality in medieval literature.
no code implementations • COLING (LaTeCHCLfL, CLFL, LaTeCH) 2020 • Nikolay Banar, Walter Daelemans, Mike Kestemont
We investigate the use of Iconclass in the context of neural machine translation for NL<->EN artwork titles.
1 code implementation • 25 Oct 2022 • Wouter Haverals, Mike Kestemont
This study is devoted to two of the oldest known manuscripts in which the oeuvre of the medieval mystical author Hadewijch has been preserved: Brussels, KBR, 2879-2880 (ms. A) and Brussels, KBR, 2877-2878 (ms. B).
no code implementations • SEMEVAL 2021 • Ben Burtenshaw, Mike Kestemont
This paper describes the system developed by the Antwerp Centre for Digital humanities and literary Criticism [UAntwerp] for toxic span detection.
no code implementations • TimeMachine RFC 2031 • Frédéric Kaplan, Kevin Baumer, Mike Kestemont, Daniel Jeller
Reaching consensus on the technology options to pursue in a programme as large as Time Machine is a complex issue.
no code implementations • 22 May 2020 • Nikolay Banar, Walter Daelemans, Mike Kestemont
To stimulate further research in this area and close the gap with subword-level NMT, we make all our code and models publicly available.
no code implementations • 11 May 2020 • Matthia Sabatelli, Mike Kestemont, Pierre Geurts
We study the generalization properties of pruned neural networks that are the winners of the lottery ticket hypothesis on datasets of natural images.
no code implementations • LREC 2020 • Joanna Byszuk, Michał Woźniak, Mike Kestemont, Albert Leśniak, Wojciech Łukasik, Artjoms Šeļa, Maciej Eder
Fictional prose can be broadly divided into narrative and discursive forms with direct speech being central to any discourse representation (alongside indirect reported speech and free indirect discourse).
no code implementations • WS 2019 • Enrique Manjavacas, Mike Kestemont, Folgert Karsdorp
This paper addresses Hip-Hop lyric generation with conditional Neural Language Models.
no code implementations • WS 2019 • Enrique Manjavacas, Brian Long, Mike Kestemont
The detection of allusive text reuse is particularly challenging due to the sparse evidence on which allusive references rely, commonly no or very few shared words.
2 code implementations • NAACL 2019 • Enrique Manjavacas, Ákos Kádár, Mike Kestemont
Lemmatization of standard languages is concerned with (i) abstracting over morphological differences and (ii) resolving token-lemma ambiguities of inflected words in order to map them to a dictionary headword.
no code implementations • WS 2017 • Enrique Manjavacas, Jeroen De Gussem, Walter Daelemans, Mike Kestemont
Recent applications of neural language models have led to an increased interest in the automatic generation of natural language.
1 code implementation • 4 Mar 2016 • Mike Kestemont, Jeroen De Gussem
In this paper we consider two sequence tagging tasks for medieval Latin: part-of-speech tagging and lemmatization.
no code implementations • LREC 2012 • Mike Kestemont, Claudia Peersman, Benny De Decker, Guy De Pauw, Kim Luyckx, Roser Morante, Frederik Vaassen, Janneke van de Loo, Walter Daelemans
Although in recent years numerous forms of Internet communication (such as e-mail, blogs, chat rooms and social network environments) have emerged, balanced corpora of Internet speech with trustworthy meta-information (e.g. age and gender) or linguistic annotations are still limited.