no code implementations • LREC 2022 • Nadja Schauffler, Toni Bernhart, Andre Blessing, Gunilla Eschenbach, Markus Gärtner, Kerstin Jung, Anna Kinder, Julia Koch, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu, Lorenz Wesemann, Jonas Kuhn
We present the steps taken towards an exploration platform for a multi-modal corpus of German lyric poetry from the Romantic era developed in the project »textklang«.
1 code implementation • MSR (COLING) 2020 • Xiang Yu, Simon Tannert, Ngoc Thang Vu, Jonas Kuhn
We introduce the IMS contribution to the Surface Realization Shared Task 2020.
no code implementations • ACL (IWPT) 2021 • Stefan Grünewald, Annemarie Friedrich, Jonas Kuhn
We find that the choice of pre-trained embeddings has by far the greatest impact on parser performance and identify XLM-R as a robust choice across the languages in our study.
no code implementations • LREC 2022 • Lukas Wertz, Katsiaryna Mirylenka, Jonas Kuhn, Jasmina Bogojeska
Large-scale, multi-label text datasets with high numbers of different classes are expensive to annotate, even more so if they deal with domain-specific language.
no code implementations • Findings (ACL) 2022 • Erenay Dayanik, Andre Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa, Sebastian Padó
Many tasks in text-based computational social science (CSS) involve the classification of political statements into categories based on a domain-specific codebook.
no code implementations • ACL (spnlp) 2021 • Erenay Dayanik, Andre Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa, Sebastian Padó
The analysis of public debates crucially requires the classification of political demands according to hierarchical claim ontologies (e.g. for immigration, a supercategory “Controlling Migration” might have subcategories “Asylum limit” or “Border installations”).
1 code implementation • UDW (COLING) 2020 • Tillmann Dönicke, Xiang Yu, Jonas Kuhn
The Universal Dependencies treebanks are a still-growing collection of treebanks for a wide range of languages, all annotated with a common inventory of dependency relations.
no code implementations • 21 Nov 2023 • Dominik Schlechtweg, Shafqat Mumtaz Virk, Pauline Sander, Emma Sköldberg, Lukas Theuer Linke, Tuo Zhang, Nina Tahmasebi, Jonas Kuhn, Sabine Schulte im Walde
We present the DURel tool that implements the annotation of semantic proximity between uses of words in an online, open-source interface.
no code implementations • 11 Jul 2022 • Julia Koch, Florian Lux, Nadja Schauffler, Toni Bernhart, Felix Dieterle, Jonas Kuhn, Sandra Richter, Gabriel Viehhauser, Ngoc Thang Vu
Speech synthesis for poetry is challenging due to specific intonation patterns inherent to poetic speech.
no code implementations • 19 Nov 2021 • Nico Blokker, André Blessing, Erenay Dayanik, Jonas Kuhn, Sebastian Padó, Gabriella Lapesa
Besides the released resources and the case-study, our contribution is also methodological: we talk the reader through the steps from a newspaper article to a discourse network, demonstrating that there is not just one discourse network for the German migration debate, but multiple ones, depending on the topic of interest (political actors, policy fields, time spans).
1 code implementation • CoNLL (EMNLP) 2021 • Elizaveta Sineva, Stefan Grünewald, Annemarie Friedrich, Jonas Kuhn
In this paper, we revisit the task of negation resolution, which includes the subtasks of cue detection (e.g. "not", "never") and scope resolution.
no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Anna Hätty, Julia Bettinger, Michael Dorna, Jonas Kuhn, Sabine Schulte im Walde
Predicting the difficulty of domain-specific vocabulary is an important task towards a better understanding of a domain, and to enhance the communication between lay people and experts.
1 code implementation • Joint Conference on Lexical and Computational Semantics 2021 • Dominik Schlechtweg, Enrique Castaneda, Jonas Kuhn, Sabine Schulte im Walde
We suggest modeling human-annotated Word Usage Graphs, which capture fine-grained semantic proximity distinctions between word uses, with a Bayesian formulation of the Weighted Stochastic Block Model, a generative model for random graphs popular in biology, physics and social sciences.
1 code implementation • ACL 2021 • Sinan Kurtyigit, Maike Park, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
While there is a large amount of research in the field of Lexical Semantic Change Detection, only a few approaches go beyond a standard benchmark evaluation of existing models.
no code implementations • EACL 2021 • Severin Laicher, Sinan Kurtyigit, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
Type- and token-based embedding architectures are still competing in lexical semantic change detection.
no code implementations • COLING 2020 • Tillmann Dönicke, Xiang Yu, Jonas Kuhn
This paper proposes a framework for the expression of typological statements which uses real-valued logics to capture the empirical truth value (truth degree) of a formula on a given data source, e.g. a collection of multilingual treebanks with comparable annotation.
2 code implementations • 23 Oct 2020 • Stefan Grünewald, Annemarie Friedrich, Jonas Kuhn
We find that the choice of pre-trained embeddings has by far the greatest impact on parser performance and identify XLM-R as a robust choice across the languages in our study.
no code implementations • ACL 2020 • Xiang Yu, Simon Tannert, Ngoc Thang Vu, Jonas Kuhn
We propose a graph-based method to tackle the dependency tree linearization task.
no code implementations • WS 2020 • Xiang Yu, Ngoc Thang Vu, Jonas Kuhn
We present an iterative data augmentation framework, which trains and searches for an optimal ensemble and simultaneously annotates new training data in a self-training style.
no code implementations • WS 2020 • Agnieszka Falenska, Anders Björkelund, Jonas Kuhn
Graph-based and transition-based dependency parsers used to have different strengths and weaknesses.
no code implementations • LREC 2020 • Gabriella Lapesa, Andre Blessing, Nico Blokker, Erenay Dayanik, Sebastian Haunss, Jonas Kuhn, Sebastian Padó
DEbateNet-migr15 is a manually annotated dataset for German which covers the public debate on immigration in 2015.
no code implementations • LREC 2020 • Reem Alatrash, Dominik Schlechtweg, Jonas Kuhn, Sabine Schulte im Walde
Modelling language change is an increasingly important area of interest within the fields of sociolinguistics and historical linguistics.
no code implementations • LREC 2020 • Agnieszka Falenska, Zoltán Czesznak, Kerstin Jung, Moritz Völkel, Wolfgang Seeker, Jonas Kuhn
The dataset extends the existing GRAIN corpus and comes with constituency and dependency trees for six interviews.
no code implementations • WS 2019 • Xiang Yu, Agnieszka Falenska, Marina Haid, Ngoc Thang Vu, Jonas Kuhn
We introduce the IMS contribution to the Surface Realization Shared Task 2019.
no code implementations • WS 2019 • Xiang Yu, Agnieszka Falenska, Ngoc Thang Vu, Jonas Kuhn
We present a dependency tree linearization model with two novel components: (1) a tree-structured encoder based on bidirectional Tree-LSTM that propagates information first bottom-up then top-down, which allows each token to access information from the entire tree; and (2) a linguistically motivated head-first decoder that emphasizes the central role of the head and linearizes the subtree by incrementally attaching the dependents on both sides of the head.
no code implementations • WS 2019 • Xiang Yu, Ngoc Thang Vu, Jonas Kuhn
The generalized Dyck language has been used to analyze the ability of Recurrent Neural Networks (RNNs) to learn context-free grammars (CFGs).
1 code implementation • ACL 2019 • Sebastian Padó, Andre Blessing, Nico Blokker, Erenay Dayanik, Sebastian Haunss, Jonas Kuhn
Understanding the structures of political debates (which actors make what claims) is essential for understanding democratic political decision making.
no code implementations • ACL 2019 • Andre Blessing, Nico Blokker, Sebastian Haunss, Jonas Kuhn, Gabriella Lapesa, Sebastian Padó
This paper describes the MARDY corpus annotation environment developed for a collaboration between political science and computational linguistics.
no code implementations • ACL 2019 • Agnieszka Falenska, Jonas Kuhn
Classical non-neural dependency parsers put considerable effort into the design of feature functions.
no code implementations • 7 Nov 2018 • Agnieszka Falenska, Anders Björkelund, Xiang Yu, Jonas Kuhn
In this paper we show which components of the system were the most responsible for its final performance.
no code implementations • WS 2018 • Xiang Yu, Ngoc Thang Vu, Jonas Kuhn
We present a general approach with reinforcement learning (RL) to approximate dynamic oracles for transition systems where exact dynamic oracles are difficult to derive.
1 code implementation • COLING 2018 • Markus Gärtner, Sven Mayer, Valentin Schwind, Eric Hämmerle, Emine Turcan, Florin Rheinwald, Gustav Murawski, Lars Lischke, Jonas Kuhn
The application combines supporting text annotation and enriching the text with additional information from a number of sources directly within the application.
no code implementations • COLING 2018 • Ina Roesiger, Arndt Riester, Jonas Kuhn
Recent work on bridging resolution has so far been based on the corpus ISNotes (Markert et al. 2012), as this was the only corpus available with unrestricted bridging annotation.
no code implementations • COLING 2018 • Thomas Haider, Jonas Kuhn
We present the first supervised approach to rhyme detection with Siamese Recurrent Networks (SRN) that offer near-perfect performance (97% accuracy) with a single model on rhyme pairs for German, English and French, allowing future large-scale analyses.
2 code implementations • NAACL 2018 • Kyle Richardson, Jonathan Berant, Jonas Kuhn
Traditional approaches to semantic parsing (SP) work by training individual models for each available parallel dataset of text-meaning pairs.
no code implementations • WS 2017 • Kyle Richardson, Sina Zarrieß, Jonas Kuhn
We propose a new shared task for tactical data-to-text generation in the domain of source code libraries.
no code implementations • EMNLP 2017 • Sarah Schulz, Jonas Kuhn
One of the main obstacles for many Digital Humanities projects is the low data availability.
no code implementations • CONLL 2017 • Anders Björkelund, Agnieszka Falenska, Xiang Yu, Jonas Kuhn
This paper presents the IMS contribution to the CoNLL 2017 Shared Task.
1 code implementation • 31 Jul 2017 • Kyle Richardson, Sina Zarrieß, Jonas Kuhn
We propose a new shared task for tactical data-to-text generation in the domain of source code libraries.
2 code implementations • EMNLP 2017 • Kyle Richardson, Jonas Kuhn
For a given text query and background API, the tool finds candidate functions by performing a translation from the text to known representations in the API using the semantic parsing approach of Richardson and Kuhn (2017).
no code implementations • ACL 2017 • Kyle Richardson, Jonas Kuhn
We consider the problem of translating high-level textual descriptions to formal representations in technical documentation as part of an effort to model the meaning of such documentation.
no code implementations • WS 2016 • Jonas Kuhn
I start this talk by sketching some sample scenarios of Digital Humanities projects which involve various Humanities and Social Science disciplines, noting that the potential for a meaningful contribution to higher-level questions is highest when the employed language technological models are carefully tailored both (a) to characteristics of the given target corpus, and (b) to relevant analytical subtasks feeding the discipline-specific research questions.
no code implementations • COLING 2016 • Andrea Glaser, Jonas Kuhn
We propose an approach to Named Entity Disambiguation that avoids a problem of standard work on the task (likewise affecting fully supervised, weakly supervised, or distantly supervised machine learning techniques): the treatment of name mentions referring to people with no (or very little) coverage in the textual training data is systematically incorrect.
no code implementations • LREC 2016 • Ina Roesiger, Jonas Kuhn
This paper presents a data-driven co-reference resolution system for German that has been adapted from IMS HotCoref, a co-reference resolver for English.
no code implementations • LREC 2016 • Sarah Schulz, Jonas Kuhn
In this paper, we investigate unsupervised and semi-supervised methods for part-of-speech (PoS) tagging in the context of historical German text.
no code implementations • TACL 2016 • Kyle Richardson, Jonas Kuhn
We introduce a new approach to training a semantic parser that uses textual entailment judgements as supervision.
no code implementations • LREC 2014 • Wiltrud Kessler, Jonas Kuhn
For each sentence we have annotated detailed information about the comparisons it contains: The comparative predicate that expresses the comparison, the type of the comparison, the two entities that are being compared, and the aspect they are compared in.
no code implementations • LREC 2014 • Wolfgang Seeker, Jonas Kuhn
We present a dependency conversion of five German test sets from five different genres.
no code implementations • LREC 2014 • Kyle Richardson, Jonas Kuhn
We present a new resource, the UnixMan Corpus, for studying language learning in the domain of Unix utility manuals.
no code implementations • LREC 2014 • Masood Ghayoomi, Jonas Kuhn
In this paper, we introduce an algorithm to convert an HPSG-based treebank into its parallel dependency-based treebank.
no code implementations • LREC 2014 • Andre Blessing, Jonas Kuhn
We present a web-based application which is called TEA (Textual Emigration Analysis) as a showcase that applies textual analysis for the humanities.
no code implementations • LREC 2014 • Andrea Glaser, Jonas Kuhn
In our experiments on real data we obtain comparable results.
no code implementations • LREC 2012 • Patrick Ziering, Sina Zarrieß, Jonas Kuhn
In this paper, we investigate the usage of a non-canonical German passive alternation for ditransitive verbs, the recipient passive, in naturally occurring corpus data.
no code implementations • LREC 2012 • Wolfgang Seeker, Jonas Kuhn
We present a carefully designed dependency conversion of the German phrase-structure treebank TiGer that explicitly represents verb ellipses by introducing empty nodes into the tree.