no code implementations • 18 Nov 2022 • Kaspar Beelen, Daniel van Strien
This paper discusses the benefits of including metadata when training language models on historical collections.
1 code implementation • 30 Nov 2021 • Kasra Hosseini, Daniel C. S. Wilson, Kaspar Beelen, Katherine McDonough
We present MapReader, a free, open-source software library written in Python for analyzing large map collections (scanned or born-digital).
2 code implementations • 24 May 2021 • Kasra Hosseini, Kaspar Beelen, Giovanni Colavizza, Mariona Coll Ardanuy
We present four types of neural language models trained on a large historical dataset of books in English, published between 1760 and 1900 and comprising ~5.1 billion tokens.
1 code implementation • COLING 2020 • Mariona Coll Ardanuy, Federico Nanni, Kaspar Beelen, Kasra Hosseini, Ruth Ahnert, Jon Lawrence, Katherine McDonough, Giorgia Tolfo, Daniel C. S. Wilson, Barbara McGillivray
This paper proposes a new approach to animacy detection, the task of determining whether an entity is represented as animate in a text.
1 code implementation • 15 Nov 2017 • Hosein Azarbonyad, Mostafa Dehghani, Kaspar Beelen, Alexandra Arkut, Maarten Marx, Jaap Kamps
We propose an approach for detecting semantic shifts between different viewpoints, broadly defined as sets of texts that share a specific metadata feature, which can be a time period but also a social entity such as a political party.