no code implementations • COLING (LaTeCHCLfL, CLFL, LaTeCH) 2020 • Bernhard Liebl, Manuel Burghardt
In this paper we describe an approach for the computer-aided identification of Shakespearean intertextuality in a corpus of contemporary fiction.
no code implementations • 6 Aug 2020 • Bernhard Liebl, Manuel Burghardt
We investigate how to train a high-quality optical character recognition (OCR) model for difficult historical typefaces on degraded paper.
1 code implementation • 15 Apr 2020 • Bernhard Liebl, Manuel Burghardt
One important and particularly challenging step in the optical character recognition (OCR) of historical documents with complex layouts, such as newspapers, is the separation of text from non-text content (e.g., page borders or illustrations).
Optical Character Recognition (OCR)