Search Results for author: Frank Puppe

Found 15 papers, 7 papers with code

The FairyNet Corpus - Character Networks for German Fairy Tales

no code implementations • EMNLP (LaTeCH-CLfL) 2021 • David Schmidt, Albin Zehe, Janne Lorenzen, Lisa Sergel, Sebastian Düker, Markus Krug, Frank Puppe

The release of this corpus provides an opportunity to train and compare different algorithms for the extraction of character networks, which so far was barely possible due to the heterogeneous interests of previous researchers.

Detecting Scenes in Fiction: A new Segmentation Task

no code implementations • EACL 2021 • Albin Zehe, Leonard Konle, Lea Katharina Dümpelmann, Evelyn Gius, Andreas Hotho, Fotis Jannidis, Lucas Kaufmann, Markus Krug, Frank Puppe, Nils Reiter, Annekea Schreiber, Nathalie Wiedmer

This paper introduces the novel task of scene segmentation on narrative texts and provides an annotated corpus, a discussion of the linguistic and narrative properties of the task, and baseline experiments towards automatic solutions.

Coreference Resolution, Scene Segmentation

OCR4all -- An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings

no code implementations • 9 Sep 2019 • Christian Reul, Dennis Christ, Alexander Hartelt, Nico Balbach, Maximilian Wehner, Uwe Springmann, Christoph Wick, Christine Grundig, Andreas Büttner, Frank Puppe

Nevertheless, in the last few years great progress has been made in the area of historical OCR, resulting in several powerful open-source tools for preprocessing, layout recognition and segmentation, character recognition and post-processing.

Optical Character Recognition (OCR)

State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines

1 code implementation • 8 Oct 2018 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

In this paper we evaluate Optical Character Recognition (OCR) of 19th century Fraktur scripts without book-specific training using mixed models, i.e., models trained to recognize a variety of fonts and typesets from previously unseen sources.

Optical Character Recognition (OCR)

Calamari - A High-Performance Tensorflow-based Deep Learning Package for Optical Character Recognition

1 code implementation • 5 Jul 2018 • Christoph Wick, Christian Reul, Frank Puppe

Optical Character Recognition (OCR) on contemporary and historical data remains a focus of many researchers.

Optical Character Recognition (OCR)

Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

1 code implementation • 27 Feb 2018 • Christoph Wick, Christian Reul, Frank Puppe

This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books.

Optical Character Recognition (OCR)

Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning

1 code implementation • 27 Feb 2018 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

We combine three methods which significantly improve the accuracy of OCR models trained on early printed books: (1) The pretraining method utilizes the information stored in already existing models trained on a variety of typesets (mixed models) instead of starting the training from scratch.

Active Learning

Transfer Learning for OCRopus Model Training on Early Printed Books

1 code implementation • 15 Dec 2017 • Christian Reul, Christoph Wick, Uwe Springmann, Frank Puppe

The evaluation on seven early printed books showed that training from the Latin mixed model reduces the average number of errors by 43% and 26% compared to training from scratch with 60 and 150 lines of ground truth, respectively.

Optical Character Recognition (OCR), Transfer Learning

Leaf Identification Using a Deep Convolutional Neural Network

no code implementations • 4 Dec 2017 • Christoph Wick, Frank Puppe

Convolutional neural networks (CNNs) have become popular in recent years, especially in computer vision, because they have achieved outstanding performance on tasks such as image classification.

Data Augmentation, General Classification

Improving OCR Accuracy on Early Printed Books by utilizing Cross Fold Training and Voting

1 code implementation • 27 Nov 2017 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

Experiments on seven early printed books show that the proposed method outperforms the standard approach considerably by reducing the number of errors by up to 50% and more.

Optical Character Recognition (OCR)