no code implementations • VarDial (COLING) 2020 • Fernando Benites, Manuela Hürlimann, Pius von Däniken, Mark Cieliebak
We describe our approaches for the Social Media Geolocation (SMG) task at the VarDial Evaluation Campaign 2020.
no code implementations • EMNLP (newsum) 2021 • Don Tuggener, Margot Mieskes, Jan Deriu, Mark Cieliebak
Dialogue summarization is a long-standing task in the field of NLP, and several data sets with dialogues and associated human-written summaries of different styles exist.
no code implementations • 13 Oct 2023 • Claudio Paonessa, Yanick Schraner, Jan Deriu, Manuela Hürlimann, Manfred Vogel, Mark Cieliebak
This paper investigates the challenges in building Swiss German speech translation systems, specifically focusing on the impact of dialect diversity and differences between Swiss German and Standard German.
no code implementations • 6 Jun 2023 • Jan Deriu, Pius von Däniken, Don Tuggener, Mark Cieliebak
A major challenge in the field of Text Generation is evaluation: Human evaluations are cost-intensive, and automated metrics often display considerable disagreement with human judgments.
no code implementations • 30 May 2023 • Michel Plüss, Jan Deriu, Yanick Schraner, Claudio Paonessa, Julia Hartmann, Larissa Schmidt, Christian Scheller, Manuela Hürlimann, Tanja Samardžić, Manfred Vogel, Mark Cieliebak
We train an ASR model on the training set and achieve an average BLEU score of 74.7 on the test set.
no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.
no code implementations • 24 Oct 2022 • Pius von Däniken, Jan Deriu, Don Tuggener, Mark Cieliebak
A major challenge in the field of Text Generation is evaluation because we lack a sound theory that can be leveraged to extract guidelines for evaluation campaigns.
1 code implementation • LREC 2022 • Michel Plüss, Manuela Hürlimann, Marc Cuny, Alla Stöckli, Nikolaos Kapotis, Julia Hartmann, Malgorzata Anna Ulasik, Christian Scheller, Yanick Schraner, Amit Jain, Jan Deriu, Mark Cieliebak, Manfred Vogel
We present SDS-200, a corpus of Swiss German dialectal speech with Standard German text translations, annotated with dialect, age, and gender information of the speakers.
1 code implementation • ACL 2022 • Jan Deriu, Don Tuggener, Pius von Däniken, Mark Cieliebak
This paper introduces an adversarial method to stress-test trained metrics for evaluating conversational dialogue systems.
1 code implementation • EMNLP 2020 • Jan Deriu, Don Tuggener, Pius von Däniken, Jon Ander Campos, Alvaro Rodrigo, Thiziri Belkacem, Aitor Soroa, Eneko Agirre, Mark Cieliebak
In this work, we introduce Spot The Bot, a cost-efficient and robust evaluation framework that replaces human-bot conversations with conversations between bots.
no code implementations • ACL 2020 • Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs.
no code implementations • 4 May 2020 • Jon Ander Campos, Arantxa Otegi, Aitor Soroa, Jan Deriu, Mark Cieliebak, Eneko Agirre
We present DoQA, a dataset with 2,437 dialogues and 10,917 QA pairs.
no code implementations • LREC 2020 • Malgorzata Anna Ulasik, Manuela Hürlimann, Fabian Germann, Esin Gedik, Fernando Benites, Mark Cieliebak
In this paper, we present CEASR, a Corpus for Evaluating the quality of Automatic Speech Recognition (ASR).
no code implementations • LREC 2020 • Fernando Benites, Gilbert François Duivesteijn, Pius von Däniken, Mark Cieliebak
Transliteration is the process of expressing a proper name from a source language in the characters of a target language (e.g., from Cyrillic to Latin characters).
no code implementations • LREC 2020 • Don Tuggener, Pius von Däniken, Thomas Peetz, Mark Cieliebak
We present LEDGAR, a multilabel corpus of legal provisions in contracts.
no code implementations • ACL 2020 • Jan Deriu, Katsiaryna Mlynchyk, Philippe Schläpfer, Alvaro Rodrigo, Dirk von Grünigen, Nicolas Kaiser, Kurt Stockinger, Eneko Agirre, Mark Cieliebak
For this, we introduce an intermediate representation called Operation Trees (OT), which is based on the logical query plan in a database.
no code implementations • WS 2019 • Jan Deriu, Mark Cieliebak
We present "AutoJudge", an automated evaluation method for conversational dialogue systems.
no code implementations • ACL 2019 • Ahmad Aghaebrahimian, Mark Cieliebak
These words are used as a class description for more label-aware text classification.
1 code implementation • WS 2019 • Arno Schneuwly, Ralf Grubenmann, Séverine Rion Logean, Mark Cieliebak, Martin Jaggi
We study how language on social media is linked to diseases such as atherosclerotic heart disease (AHD), diabetes and various types of cancer.
no code implementations • WS 2019 • Fernando Benites, Pius von Däniken, Mark Cieliebak
The goal was to identify dialects of Swiss German in GDI and Sumerian and Akkadian in CLI.
no code implementations • 10 May 2019 • Jan Deriu, Alvaro Rodrigo, Arantxa Otegi, Guillermo Echegoyen, Sophie Rosset, Eneko Agirre, Mark Cieliebak
We cover each class by introducing the main technologies developed for the dialogue systems and then by presenting the evaluation methods regarding this class.
1 code implementation • WS 2018 • Jan Milan Deriu, Mark Cieliebak
In this work, we present our system for Natural Language Generation where we control various aspects of the surface realization in order to increase the lexical variability of the utterances, such that they sound more diverse and interesting.
no code implementations • COLING 2018 • Fernando Benites, Ralf Grubenmann, Pius von Däniken, Dirk von Grünigen, Jan Deriu, Mark Cieliebak
We describe our approaches used in the German Dialect Identification (GDI) task at the VarDial Evaluation Campaign 2018.
no code implementations • WS 2017 • Pius von Däniken, Mark Cieliebak
We present our system for the WNUT 2017 Named Entity Recognition challenge on Twitter data.
Ranked #22 on Named Entity Recognition (NER) on WNUT 2017
1 code implementation • SEMEVAL 2017 • Jan Milan Deriu, Mark Cieliebak
In this paper we propose a system for reranking answers for a given question.
no code implementations • SEMEVAL 2017 • Simon Müller, Tobias Huonder, Jan Deriu, Mark Cieliebak
In this paper, we propose a classifier for predicting topic-specific sentiments of English Twitter messages.
no code implementations • WS 2017 • Mark Cieliebak, Jan Milan Deriu, Dominic Egger, Fatih Uzdilli
We use this new corpus and two existing corpora to provide state-of-the-art benchmarks for sentiment analysis in German: we implemented a CNN (based on the winning system of SemEval-2016) and a feature-based SVM and compare their performance on all three corpora.
no code implementations • WS 2017 • Jan Milan Deriu, Martin Weilenmann, Dirk Von Gruenigen, Mark Cieliebak
In this paper, we investigate the cross-domain performance of a current state-of-the-art sentiment analysis system.
1 code implementation • 7 Mar 2017 • Jan Deriu, Aurelien Lucchi, Valeria De Luca, Aliaksei Severyn, Simon Müller, Mark Cieliebak, Thomas Hofmann, Martin Jaggi
This paper presents a novel approach for multi-lingual sentiment classification in short texts.
no code implementations • LREC 2014 • Mark Cieliebak, Oliver Dürr, Fatih Uzdilli
In this paper, we analyze the quality of several commercial tools for sentiment detection.