Search Results for author: Steffen Eger

Found 78 papers, 50 papers with code

End-to-end style-conditioned poetry generation: What does it take to learn from examples alone?

no code implementations EMNLP (LaTeCHCLfL, CLFL, LaTeCH) 2021 Jörg Wöckener, Thomas Haider, Tristan Miller, The-Khang Nguyen, Thanh Tung Linh Nguyen, Minh Vu Pham, Jonas Belouadi, Steffen Eger

In this work, we design an end-to-end model for poetry generation based on conditioned recurrent neural network (RNN) language models whose goal is to learn stylistic features (poem length, sentiment, alliteration, and rhyming) from examples alone.

Evaluation of Coreference Resolution Systems Under Adversarial Attacks

no code implementations EMNLP (CODI) 2020 Haixia Chai, Wei Zhao, Steffen Eger, Michael Strube

A substantial overlap of coreferent mentions in the CoNLL dataset magnifies the recent progress on coreference resolution.

coreference-resolution

TUDA-Reproducibility @ ReproGen: Replicability of Human Evaluation of Text-to-Text and Concept-to-Text Generation

no code implementations INLG (ACL) 2021 Christian Richter, Yanran Chen, Steffen Eger

This paper describes our contribution to the Shared Task ReproGen by Belz et al. (2021), which investigates the reproducibility of human evaluations in the context of Natural Language Generation.

Concept-To-Text Generation Paper generation

TUDa at WMT21: Sentence-Level Direct Assessment with Adapters

no code implementations WMT (EMNLP) 2021 Gregor Geigle, Jonas Stadtmüller, Wei Zhao, Jonas Pfeiffer, Steffen Eger

This paper presents our submissions to the WMT2021 Shared Task on Quality Estimation, Task 1 Sentence-Level Direct Assessment.

Sentence

Syntactic Language Change in English and German: Metrics, Parsers, and Convergences

1 code implementation 18 Feb 2024 Yanran Chen, Wei Zhao, Anne Breitbarth, Manuel Stoeckel, Alexander Mehler, Steffen Eger

Even though we have evidence that recent parsers trained on modern treebanks are not heavily affected by data 'noise' such as spelling changes and OCR errors in our historic data, we find that results of syntactic language change are sensitive to the parsers involved, which is a caution against using a single parser for evaluating syntactic language change as done in previous work.

Optical Character Recognition (OCR) Sentence

Is there really a Citation Age Bias in NLP?

no code implementations 7 Jan 2024 Hoa Nguyen, Steffen Eger

Recently, it has been noted that there is a citation age bias in the Natural Language Processing (NLP) community, one of the currently fastest growing AI subfields, in that the mean age of the bibliography of NLP papers has become ever younger in the last few years, leading to 'citation amnesia' in which older knowledge is increasingly forgotten.

NLLG Quarterly arXiv Report 09/23: What are the most influential current AI Papers?

1 code implementation 9 Dec 2023 Ran Zhang, Aida Kostikova, Christoph Leiter, Jonas Belouadi, Daniil Larionov, Yanran Chen, Vivian Fresen, Steffen Eger

Artificial Intelligence (AI) has witnessed rapid growth, especially in the subfields Natural Language Processing (NLP), Machine Learning (ML) and Computer Vision (CV).

Navigate

The Eval4NLP 2023 Shared Task on Prompting Large Language Models as Explainable Metrics

1 code implementation 30 Oct 2023 Christoph Leiter, Juri Opitz, Daniel Deutsch, Yang Gao, Rotem Dror, Steffen Eger

Specifically, we propose a novel competition setting in which we select a list of allowed LLMs and disallow fine-tuning to ensure a focus on prompting.

Machine Translation Text Generation

AutomaTikZ: Text-Guided Synthesis of Scientific Vector Graphics with TikZ

1 code implementation 30 Sep 2023 Jonas Belouadi, Anne Lauscher, Steffen Eger

To address this, we propose the use of TikZ, a well-known abstract graphics language that can be compiled to vector graphics, as an intermediate representation of scientific figures.

Language Modelling Large Language Model +2

NLLG Quarterly arXiv Report 06/23: What are the most influential current AI Papers?

1 code implementation 31 Jul 2023 Steffen Eger, Christoph Leiter, Jonas Belouadi, Ran Zhang, Aida Kostikova, Daniil Larionov, Yanran Chen, Vivian Fresen

In particular, we compile a list of the 40 most popular papers based on normalized citation counts from the first half of 2023.

Cross-lingual Cross-temporal Summarization: Dataset, Models, Evaluation

1 code implementation 22 Jun 2023 Ran Zhang, Jihed Ouni, Steffen Eger

Additionally, we explore the potential of ChatGPT for CLCTS as a summarizer and an evaluator.

Negation

Towards Explainable Evaluation Metrics for Machine Translation

no code implementations 22 Jun 2023 Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger

In this context, we also discuss the latest state-of-the-art approaches to explainable metrics based on generative models such as ChatGPT and GPT4.

Machine Translation Translation

Cross-Genre Argument Mining: Can Language Models Automatically Fill in Missing Discourse Markers?

no code implementations 7 Jun 2023 Gil Rocha, Henrique Lopes Cardoso, Jonas Belouadi, Steffen Eger

We demonstrate the impact of our approach on an Argument Mining downstream task, evaluated on different corpora, showing that language models can be trained to automatically fill in discourse markers across different corpora, improving the performance of a downstream model in some, but not all, cases.

Argument Mining Discourse Parsing

ChatGPT: A Meta-Analysis after 2.5 Months

no code implementations 20 Feb 2023 Christoph Leiter, Ran Zhang, Yanran Chen, Jonas Belouadi, Daniil Larionov, Vivian Fresen, Steffen Eger

ChatGPT, a chatbot developed by OpenAI, has gained widespread popularity and media attention since its release in November 2022.

Chatbot Ethics

Transformers Go for the LOLs: Generating (Humourous) Titles from Scientific Abstracts End-to-End

1 code implementation 20 Dec 2022 Yanran Chen, Steffen Eger

Our human evaluation suggests that our best end-to-end system performs similarly to human authors (but arguably slightly worse).

FairGer: Using NLP to Measure Support for Women and Migrants in 155 Years of German Parliamentary Debates

2 code implementations 9 Oct 2022 Dominik Beese, Ole Pütz, Steffen Eger

We measure support for women and migrants in German political debates over the last 155 years.

Layer or Representation Space: What makes BERT-based Evaluation Metrics Robust?

1 code implementation COLING 2022 Doan Nam Long Vu, Nafise Sadat Moosavi, Steffen Eger

The evaluation of recent embedding-based evaluation metrics for text generation is primarily based on measuring their correlation with human evaluations on standard benchmarks.

Text Generation Word Embeddings

MENLI: Robust Evaluation Metrics from Natural Language Inference

1 code implementation 15 Aug 2022 Yanran Chen, Steffen Eger

Recently proposed BERT-based evaluation metrics for text generation perform well on standard benchmarks but are vulnerable to adversarial attacks, e.g., relating to information correctness.

Adversarial Attack Adversarial Robustness +4

Reproducibility Issues for BERT-based Evaluation Metrics

1 code implementation 30 Mar 2022 Yanran Chen, Jonas Belouadi, Steffen Eger

We find that reproduction of claims and results often fails because of (i) heavy undocumented preprocessing involved in the metrics, (ii) missing code and (iii) reporting weaker results for the baseline metrics.

Machine Translation Text Generation

Towards Explainable Evaluation Metrics for Natural Language Generation

1 code implementation 21 Mar 2022 Christoph Leiter, Piyawat Lertvittayakumjorn, Marina Fomicheva, Wei Zhao, Yang Gao, Steffen Eger

We also provide a synthesizing overview of recent approaches for explainable machine translation metrics and discuss how they relate to those goals and properties.

Machine Translation Text Generation +2

Did AI get more negative recently?

2 code implementations 28 Feb 2022 Dominik Beese, Begüm Altunbaş, Görkem Güzeler, Steffen Eger

We annotate over 1.5k papers from NLP and ML to train a SciBERT-based model to automatically predict the stance of a paper based on its title and abstract.

USCORE: An Effective Approach to Fully Unsupervised Evaluation Metrics for Machine Translation

1 code implementation 21 Feb 2022 Jonas Belouadi, Steffen Eger

We show that our fully unsupervised metrics are effective, i.e., they beat supervised competitors on 4 out of our 5 evaluation datasets.

Machine Translation Parallel Corpus Mining +3

Constrained Density Matching and Modeling for Cross-lingual Alignment of Contextualized Representations

no code implementations 31 Jan 2022 Wei Zhao, Steffen Eger

Multilingual representations pre-trained with monolingual data exhibit considerably unequal task performances across languages.

Attribute

DiscoScore: Evaluating Text Generation with BERT and Discourse Coherence

1 code implementation 26 Jan 2022 Wei Zhao, Michael Strube, Steffen Eger

Still, recent BERT-based evaluation metrics are weak in recognizing coherence, and thus are not reliable in a way to spot the discourse-level improvements of those text generation systems.

Document Level Machine Translation Machine Translation +1

Better than Average: Paired Evaluation of NLP Systems

1 code implementation ACL 2021 Maxime Peyrard, Wei Zhao, Steffen Eger, Robert West

Evaluation in NLP is usually done by comparing the scores of competing systems independently averaged over a common set of test instances.

Constrained Density Matching and Modeling for Effective Contextualized Alignment

no code implementations 29 Sep 2021 Wei Zhao, Steffen Eger

In this work, we analyze the limitations according to which previous alignments become very resource-intensive, viz., (i) the inability to sufficiently leverage data and (ii) that alignments are not trained properly.

Diachronic Analysis of German Parliamentary Proceedings: Ideological Shifts through the Lens of Political Biases

1 code implementation 13 Aug 2021 Tobias Walter, Celina Kirschner, Steffen Eger, Goran Glavaš, Anne Lauscher, Simone Paolo Ponzetto

We analyze bias in historical corpora as encoded in diachronic distributional semantic models by focusing on two specific forms of bias, namely a political (i.e., anti-communism) and racist (i.e., antisemitism) one.

Diachronic Word Embeddings Word Embeddings

Graph Routing between Capsules

no code implementations 22 Jun 2021 Yang Li, Wei Zhao, Erik Cambria, Suhang Wang, Steffen Eger

Therefore, in this paper, we introduce a new capsule network with graph routing to learn both relationships, where capsules in each layer are treated as the nodes of a graph.

Relation text-classification +1

CMCE at SemEval-2020 Task 1: Clustering on Manifolds of Contextualized Embeddings to Detect Historical Meaning Shifts

1 code implementation SEMEVAL 2020 David Rother, Thomas Haider, Steffen Eger

Remarkably, with only 10-dimensional MBERT embeddings (reduced from the original size of 768), our submitted model performs best on subtask 1 for English and ranks third in subtask 2 for English.

Change Detection Clustering +1

Probing Multilingual BERT for Genetic and Typological Signals

no code implementations COLING 2020 Taraka Rama, Lisa Beinborn, Steffen Eger

We probe the layers in multilingual BERT (mBERT) for phylogenetic and geographic language signals across 100 languages and compute language distances based on the mBERT representations.

regression

Vec2Sent: Probing Sentence Embeddings with Natural Language Generation

1 code implementation COLING 2020 Martin Kerscher, Steffen Eger

We introspect black-box sentence embeddings by conditionally generating from them with the objective to retrieve the underlying discrete sentence.

Sentence Sentence Embeddings +1

From Hero to Zéroe: A Benchmark of Low-Level Adversarial Attacks

1 code implementation 12 Oct 2020 Steffen Eger, Yannik Benz

Adversarial attacks are label-preserving modifications to inputs of machine learning classifiers designed to fool machines but not humans.

Natural Language Inference Part-Of-Speech Tagging +1

How to Probe Sentence Embeddings in Low-Resource Languages: On Structural Design Choices for Probing Task Evaluation

1 code implementation CONLL 2020 Steffen Eger, Johannes Daxenberger, Iryna Gurevych

We then probe embeddings in a multilingual setup with design choices that lie in a 'stable region', as we identify for English, and find that results on English do not transfer to other languages.

Sentence Sentence Embeddings

On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation

1 code implementation ACL 2020 Wei Zhao, Goran Glavaš, Maxime Peyrard, Yang Gao, Robert West, Steffen Eger

We systematically investigate a range of metrics based on state-of-the-art cross-lingual semantic representations obtained with pretrained M-BERT and LASER.

Language Modelling Machine Translation +4

PO-EMO: Conceptualization, Annotation, and Modeling of Aesthetic Emotions in German and English Poetry

1 code implementation LREC 2020 Thomas Haider, Steffen Eger, Evgeny Kim, Roman Klinger, Winfried Menninghaus

Thus, we conceptualize a set of aesthetic emotions that are predictive of aesthetic appreciation in the reader, and allow the annotation of multiple labels per line to capture mixed emotions within their context.

Emotion Classification Emotion Recognition

Semantic Change and Emerging Tropes In a Large Corpus of New High German Poetry

1 code implementation WS 2019 Thomas Haider, Steffen Eger

Due to its semantic succinctness and novelty of expression, poetry is a great test bed for semantic change analysis.

Towards Scalable and Reliable Capsule Networks for Challenging NLP Applications

5 code implementations ACL 2019 Wei Zhao, Haiyun Peng, Steffen Eger, Erik Cambria, Min Yang

Obstacles hindering the development of capsule networks for challenging NLP applications include poor scalability to large output spaces and less reliable routing processes.

Ranked #1 on Text Classification on RCV1 (P@1 metric)

General Classification Multi-Label Text Classification +1

Pitfalls in the Evaluation of Sentence Embeddings

no code implementations WS 2019 Steffen Eger, Andreas Rücklé, Iryna Gurevych

Our motivation is to challenge the current evaluation of sentence embeddings and to provide an easy-to-access reference for future research.

Sentence Sentence Embeddings

Does My Rebuttal Matter? Insights from a Major NLP Conference

1 code implementation NAACL 2019 Yang Gao, Steffen Eger, Ilia Kuznetsov, Iryna Gurevych, Yusuke Miyao

We then focus on the role of the rebuttal phase, and propose a novel task to predict after-rebuttal (i.e., final) scores from initial reviews and author responses.

4k

Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems

1 code implementation NAACL 2019 Steffen Eger, Gözde Gül Şahin, Andreas Rücklé, Ji-Ung Lee, Claudia Schulz, Mohsen Mesgar, Krishnkant Swarnkar, Edwin Simpson, Iryna Gurevych

Visual modifications to text are often used to obfuscate offensive comments in social media (e.g., "!d10t") or as a writing style ("1337" in "leet speak"), among other scenarios.

Adversarial Attack Sentence

Predicting Research Trends From Arxiv

1 code implementation 7 Mar 2019 Steffen Eger, Chao Li, Florian Netzer, Iryna Gurevych

By extrapolation, we predict that these topics will remain lead problems/approaches in their fields in the short- and mid-term.

reinforcement-learning Reinforcement Learning (RL) +1

Is it Time to Swish? Comparing Deep Learning Activation Functions Across NLP tasks

1 code implementation EMNLP 2018 Steffen Eger, Paul Youssef, Iryna Gurevych

Activation functions play a crucial role in neural networks because they are the nonlinearities which have been attributed to the success story of deep learning.

Image Classification

One Size Fits All? A simple LSTM for non-literal token and construction-level classification

no code implementations COLING 2018 Erik-Lân Do Dinh, Steffen Eger, Iryna Gurevych

In this paper, we tackle four different tasks of non-literal language classification: token and construction level metaphor detection, classification of idiomatic use of infinitive-verb compounds, and classification of non-literal particle verbs.

Classification General Classification +1

Multi-Task Learning for Argumentation Mining in Low-Resource Settings

1 code implementation NAACL 2018 Claudia Schulz, Steffen Eger, Johannes Daxenberger, Tobias Kahse, Iryna Gurevych

We investigate whether and where multi-task learning (MTL) can improve performance on NLP problems related to argumentation mining (AM), in particular argument component identification.

Multi-Task Learning

Neural End-to-End Learning for Computational Argumentation Mining

2 code implementations ACL 2017 Steffen Eger, Johannes Daxenberger, Iryna Gurevych

Contrary to models that operate on the argument component level, we find that framing AM as dependency parsing leads to subpar performance results.

Dependency Parsing General Classification +1

EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION

1 code implementation SEMEVAL 2017 Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych

From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data.

General Classification

Complex Decomposition of the Negative Distance kernel

no code implementations 5 Jan 2016 Tim vor der Brück, Steffen Eger, Alexander Mehler

Our evaluation shows that the power kernel produces F-scores that are comparable to the reference kernels, but is -- except for the linear kernel -- faster to compute.

Document Classification General Classification +2

On the Number of Many-to-Many Alignments of Multiple Sequences

no code implementations 2 Nov 2015 Steffen Eger

We provide a new asymptotic formula for the case $S=\{(s_1,\ldots, s_N) \:|\: 1\le s_i\le 2\}$.
