Search Results for author: Marius Mosbach

Found 27 papers, 12 papers with code

Some steps towards the generation of diachronic WordNets

no code implementations WS (NoDaLiDa) 2019 Yuri Bizzoni, Marius Mosbach, Dietrich Klakow, Stefania Degaetano-Ortlieb

We apply hyperbolic embeddings to trace the dynamics of change of conceptual-semantic relationships in a large diachronic scientific corpus (200 years).

Discourse-based Argument Segmentation and Annotation

no code implementations ACL (ISA, IWCS) 2021 Ekaterina Saveleva, Volha Petukhova, Marius Mosbach, Dietrich Klakow

We tested the widely used Penn Discourse Treebank full parser (Lin et al., 2010) as well as the state-of-the-art neural models NeuralEDUSeg (Wang et al., 2018) and XLNet (Yang et al., 2019) on two-stage discourse segmentation and discourse relation recognition.

Discourse Segmentation, Segmentation

incom.py 2.0 - Calculating Linguistic Distances and Asymmetries in Auditory Perception of Closely Related Languages

no code implementations RANLP 2021 Marius Mosbach, Irina Stenger, Tania Avgustinova, Bernd Möbius, Dietrich Klakow

We present an extended version of a tool developed for calculating linguistic distances and asymmetries in auditory perception of closely related languages.

regression

Graph-based Argument Quality Assessment

no code implementations RANLP 2021 Ekaterina Saveleva, Volha Petukhova, Marius Mosbach, Dietrich Klakow

The paper presents a novel discourse-based approach to argument quality assessment defined as a graph classification task, where the depth of reasoning (argumentation) is evident from the number and type of detected discourse units and relations between them.

Graph Classification
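
To make the framing above concrete, here is a minimal, hypothetical sketch of argument quality assessment as graph classification over discourse graphs: the features are simply the number and type of discourse units and relations, plus a rough proxy for reasoning depth. The relation labels, toy graphs, and classifier are invented for illustration and are not the paper's model.

```python
# Illustrative sketch only: argument quality assessment as graph classification
# using hand-crafted graph statistics. Relation labels and toy data are made up.
import networkx as nx
from sklearn.linear_model import LogisticRegression

RELATIONS = ["Elaboration", "Contrast", "Cause", "Condition"]  # hypothetical label set

def argument_graph_features(graph: nx.DiGraph) -> list:
    """Summarise an argument graph by the number/type of discourse units and
    relations, plus the longest reasoning chain as a depth proxy."""
    relation_counts = [
        sum(1 for _, _, d in graph.edges(data=True) if d.get("relation") == r)
        for r in RELATIONS
    ]
    depth = nx.dag_longest_path_length(graph) if nx.is_directed_acyclic_graph(graph) else 0
    return [graph.number_of_nodes(), graph.number_of_edges(), depth, *relation_counts]

def toy_graph(n_units: int, relation: str) -> nx.DiGraph:
    """Build a chain of discourse units connected by a single relation type."""
    g = nx.DiGraph()
    for i in range(n_units - 1):
        g.add_edge(i, i + 1, relation=relation)
    return g

# Toy training data: deeper, more elaborated graphs labelled as higher quality.
graphs = [toy_graph(2, "Contrast"), toy_graph(3, "Elaboration"),
          toy_graph(5, "Cause"), toy_graph(6, "Elaboration")]
labels = [0, 0, 1, 1]  # 0 = low quality, 1 = high quality

clf = LogisticRegression().fit([argument_graph_features(g) for g in graphs], labels)
print(clf.predict([argument_graph_features(toy_graph(4, "Cause"))]))
```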

LLM2Vec: Large Language Models Are Secretly Powerful Text Encoders

2 code implementations 9 Apr 2024 Parishad BehnamGhader, Vaibhav Adlakha, Marius Mosbach, Dzmitry Bahdanau, Nicolas Chapados, Siva Reddy

We outperform encoder-only models by a large margin on word-level tasks and reach a new unsupervised state-of-the-art performance on the Massive Text Embeddings Benchmark (MTEB).

Contrastive Learning
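
As a rough illustration of turning a decoder-only LM into a text encoder, the sketch below mean-pools the hidden states of a small causal model over non-padding tokens (GPT-2 is used only because it is small). This is not the LLM2Vec recipe itself, which additionally enables bidirectional attention and adds masked next-token and unsupervised contrastive training.

```python
# Generic sketch: pool a decoder-only LM's hidden states into a text embedding.
# NOT the LLM2Vec method; "gpt2" is an assumption chosen purely for size.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
model = AutoModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def embed(texts):
    batch = tokenizer(texts, padding=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state           # (batch, seq, dim)
    mask = batch["attention_mask"].unsqueeze(-1)         # ignore padding positions
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)  # mean pooling

emb = embed(["a powerful text encoder", "an autoregressive language model"])
print(torch.nn.functional.cosine_similarity(emb[0], emb[1], dim=0))
```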

What explains the success of cross-modal fine-tuning with ORCA?

no code implementations 20 Mar 2024 Paloma García-de-Herreros, Vagrant Gautam, Philipp Slusallek, Dietrich Klakow, Marius Mosbach

ORCA (Shen et al., 2023) is a recent technique for cross-modal fine-tuning, i.e., applying pre-trained transformer models to modalities beyond their training data.

The Impact of Demonstrations on Multilingual In-Context Learning: A Multidimensional Analysis

no code implementations 20 Feb 2024 Miaoran Zhang, Vagrant Gautam, Mingyang Wang, Jesujoba O. Alabi, Xiaoyu Shen, Dietrich Klakow, Marius Mosbach

Compared to work on monolingual (English) in-context learning, multilingual in-context learning is under-explored, and we lack an in-depth understanding of the role of demonstrations in this context.

In-Context Learning
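
For readers unfamiliar with the setup, the snippet below shows how in-context demonstrations are typically concatenated into a prompt before the query. The template and the multilingual toy examples are invented for illustration and do not reproduce the paper's experiments.

```python
# Sketch of assembling in-context demonstrations into a prompt; the template
# and toy examples are assumptions for illustration only.
def build_icl_prompt(demonstrations, query, template="{text} => {label}"):
    """Concatenate labelled demonstrations followed by the unlabelled query."""
    shots = [template.format(text=t, label=l) for t, l in demonstrations]
    shots.append(template.format(text=query, label="").rstrip())
    return "\n".join(shots)

demos = [
    ("Das Essen war hervorragend.", "positive"),   # German
    ("La película fue aburrida.", "negative"),     # Spanish
]
print(build_icl_prompt(demos, "Le service était très lent."))  # French query
```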

The Hidden Space of Transformer Language Adapters

no code implementations 20 Feb 2024 Jesujoba O. Alabi, Marius Mosbach, Matan Eyal, Dietrich Klakow, Mor Geva

We analyze the operation of transformer language adapters, which are small modules trained on top of a frozen language model to adapt its predictions to new target languages.

Language Modelling
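
For context, a language adapter of the kind analysed here is typically a small bottleneck module applied to a frozen model's hidden states. Below is a minimal PyTorch sketch of such a bottleneck adapter; the dimensions, activation, and placement are illustrative assumptions, not the exact configuration studied in the paper.

```python
# Minimal sketch of a typical bottleneck adapter layer (illustrative sizes).
import torch
from torch import nn

class BottleneckAdapter(nn.Module):
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)  # project down
        self.up = nn.Linear(bottleneck_dim, hidden_dim)    # project back up
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # The residual connection keeps the frozen model's representation and
        # lets the adapter learn a small, language-specific correction.
        return hidden_states + self.up(self.act(self.down(hidden_states)))

adapter = BottleneckAdapter(hidden_dim=768)
frozen_hidden = torch.randn(2, 10, 768)  # (batch, seq, hidden) from a frozen LM
print(adapter(frozen_hidden).shape)      # torch.Size([2, 10, 768])
```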

Large GPT-like Models are Bad Babies: A Closer Look at the Relationship between Linguistic Competence and Psycholinguistic Measures

no code implementations 8 Nov 2023 Julius Steuer, Marius Mosbach, Dietrich Klakow

Research on the cognitive plausibility of language models (LMs) has so far concentrated mostly on modelling psycholinguistic response variables such as reading times, gaze durations, and N400/P600 EEG signals, while largely leaving out what Mahowald et al. (2023) described as formal and functional linguistic competence, as well as developmental plausibility.

EEG

Weaker Than You Think: A Critical Look at Weakly Supervised Learning

1 code implementation 27 May 2023 Dawei Zhu, Xiaoyu Shen, Marius Mosbach, Andreas Stephan, Dietrich Klakow

In this paper, we revisit the setup of these approaches and find that their benefits are significantly overestimated.

Weakly-supervised Learning

Few-shot Fine-tuning vs. In-context Learning: A Fair Comparison and Evaluation

1 code implementation 26 May 2023 Marius Mosbach, Tiago Pimentel, Shauli Ravfogel, Dietrich Klakow, Yanai Elazar

In this paper, we compare the generalization of few-shot fine-tuning and in-context learning to challenge datasets, while controlling for the models used, the number of examples, and the number of parameters, ranging from 125M to 30B.

Domain Generalization, In-Context Learning

Fusing Sentence Embeddings Into LSTM-based Autoregressive Language Models

1 code implementation 4 Aug 2022 Vilém Zouhar, Marius Mosbach, Dietrich Klakow

We present an LSTM-based autoregressive language model which uses prefix embeddings (from a pretrained masked language model) via fusion (e.g., concatenation) to obtain a richer context representation for language modelling.

Language Modelling, Sentence, +1
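
A minimal sketch of the concatenation-style fusion described above: a fixed sentence/prefix embedding is concatenated to every token embedding before the LSTM. All dimensions and the toy inputs are assumptions for illustration, not the paper's exact architecture.

```python
# Sketch of concatenation fusion in an LSTM language model; sizes are illustrative.
import torch
from torch import nn

class FusionLSTMLM(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=128, prefix_dim=768, hidden_dim=256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTM(emb_dim + prefix_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens: torch.Tensor, prefix: torch.Tensor) -> torch.Tensor:
        tok = self.embed(tokens)                                   # (B, T, emb_dim)
        # Repeat the fixed prefix embedding at every time step and concatenate.
        fused = torch.cat([tok, prefix.unsqueeze(1).expand(-1, tok.size(1), -1)], dim=-1)
        hidden, _ = self.lstm(fused)
        return self.out(hidden)                                    # next-token logits

model = FusionLSTMLM()
logits = model(torch.randint(0, 1000, (2, 12)), torch.randn(2, 768))
print(logits.shape)  # torch.Size([2, 12, 1000])
```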

Measuring Causal Effects of Data Statistics on Language Model's 'Factual' Predictions

no code implementations 28 Jul 2022 Yanai Elazar, Nora Kassner, Shauli Ravfogel, Amir Feder, Abhilasha Ravichander, Marius Mosbach, Yonatan Belinkov, Hinrich Schütze, Yoav Goldberg

Our causal framework and our results demonstrate the importance of studying datasets and the benefits of causality for understanding NLP models.

StereoKG: Data-Driven Knowledge Graph Construction for Cultural Knowledge and Stereotypes

1 code implementation NAACL (WOAH) 2022 Awantee Deshpande, Dana Ruiter, Marius Mosbach, Dietrich Klakow

Analyzing ethnic or religious bias is important for improving fairness, accountability, and transparency of natural language processing models.

Fairness, graph construction, +2

Adapting Pre-trained Language Models to African Languages via Multilingual Adaptive Fine-Tuning

1 code implementation COLING 2022 Jesujoba O. Alabi, David Ifeoluwa Adelani, Marius Mosbach, Dietrich Klakow

Multilingual pre-trained language models (PLMs) have demonstrated impressive performance on several downstream tasks for both high-resourced and low-resourced languages.

NER, Sentiment Analysis, +5

Do Acoustic Word Embeddings Capture Phonological Similarity? An Empirical Study

1 code implementation 16 Jun 2021 Badr M. Abdullah, Marius Mosbach, Iuliia Zaitova, Bernd Möbius, Dietrich Klakow

Our experiments show that (1) the distance in the embedding space in the best cases only moderately correlates with phonological distance, and (2) improving the performance on the word discrimination task does not necessarily yield models that better reflect word phonological similarity.

Word Embeddings
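
The kind of analysis referred to above can be sketched as a rank correlation between embedding distances and a phonological distance. In the code below, the embeddings are random toy vectors and plain Levenshtein distance over phone sequences stands in for a proper phonological distance; both are assumptions for illustration only.

```python
# Sketch: correlate embedding-space distances with a phonological distance.
import numpy as np
from scipy.spatial.distance import cosine
from scipy.stats import spearmanr

def edit_distance(a, b) -> int:
    """Plain Levenshtein distance over phone sequences (a simplistic stand-in)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

rng = np.random.default_rng(0)
words = {"cat": "k ae t", "cap": "k ae p", "dog": "d ao g", "dot": "d aa t"}
embeddings = {w: rng.normal(size=32) for w in words}  # toy "acoustic" embeddings

pairs = [(a, b) for i, a in enumerate(words) for b in list(words)[i + 1:]]
emb_dist = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
pho_dist = [edit_distance(words[a].split(), words[b].split()) for a, b in pairs]
print(spearmanr(emb_dist, pho_dist))  # rank correlation over word pairs
```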

A Closer Look at Linguistic Knowledge in Masked Language Models: The Case of Relative Clauses in American English

1 code implementation COLING 2020 Marius Mosbach, Stefania Degaetano-Ortlieb, Marie-Pauline Krielke, Badr M. Abdullah, Dietrich Klakow

Transformer-based language models achieve high performance on various tasks, but we still lack understanding of the kind of linguistic knowledge they learn and rely on.

Sentence

Fusion Models for Improved Visual Captioning

no code implementations 28 Oct 2020 Marimuthu Kalimuthu, Aditya Mogadala, Marius Mosbach, Dietrich Klakow

Building on these recent developments, and with the aim of improving the quality of generated captions, the contribution of our work is two-fold: first, we propose a generic multimodal model fusion framework for caption generation as well as emendation, in which we use different fusion strategies to integrate a pretrained Auxiliary Language Model (AuxLM) into traditional encoder-decoder visual captioning frameworks.

Automatic Speech Recognition (ASR), +5

On the Interplay Between Fine-tuning and Sentence-level Probing for Linguistic Knowledge in Pre-trained Transformers

no code implementations EMNLP (BlackboxNLP) 2020 Marius Mosbach, Anna Khokhlova, Michael A. Hedderich, Dietrich Klakow

Our analysis reveals that while fine-tuning indeed changes the representations of a pre-trained model, and these changes are typically larger for higher layers, only in very few cases does fine-tuning improve probing accuracy beyond simply using the pre-trained model with a strong pooling method.

Sentence

Sparse Graph to Sequence Learning for Vision Conditioned Long Textual Sequence Generation

no code implementations 12 Jul 2020 Aditya Mogadala, Marius Mosbach, Dietrich Klakow

Generating longer textual sequences when conditioned on the visual information is an interesting problem to explore.

Graph-to-Sequence, Sentence, +1

On the Stability of Fine-tuning BERT: Misconceptions, Explanations, and Strong Baselines

2 code implementations ICLR 2021 Marius Mosbach, Maksym Andriushchenko, Dietrich Klakow

Fine-tuning pre-trained transformer-based language models such as BERT has become a common practice dominating leaderboards across various NLP benchmarks.

Misconceptions

On the security relevance of weights in deep learning

no code implementations 8 Feb 2019 Kathrin Grosse, Thomas A. Trost, Marius Mosbach, Michael Backes, Dietrich Klakow

Recently, a weight-based attack on stochastic gradient descent that induces overfitting has been proposed.

Logit Pairing Methods Can Fool Gradient-Based Attacks

1 code implementation 29 Oct 2018 Marius Mosbach, Maksym Andriushchenko, Thomas Trost, Matthias Hein, Dietrich Klakow

Recently, Kannan et al. [2018] proposed several logit regularization methods to improve the adversarial robustness of classifiers.

Adversarial Robustness
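
For orientation, the sketch below shows the general form of an adversarial logit pairing loss: a classification loss on adversarial inputs plus a penalty that pulls clean and adversarial logits together. The toy model, the noise-based stand-in for an attack, and the pairing weight are illustrative assumptions; see Kannan et al. [2018] and the paper above for the actual methods and for why gradient-based evaluation of such defenses can be misleading.

```python
# Sketch of a logit pairing style loss on a toy classifier; all values illustrative.
import torch
from torch import nn
import torch.nn.functional as F

model = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 10))
x = torch.randn(8, 20)
y = torch.randint(0, 10, (8,))
x_adv = x + 0.1 * torch.randn_like(x).sign()  # stand-in for a real attack (e.g. PGD)

logits_clean, logits_adv = model(x), model(x_adv)
pairing_weight = 0.5  # hypothetical trade-off coefficient
loss = (F.cross_entropy(logits_adv, y)                       # classification loss
        + pairing_weight * F.mse_loss(logits_adv, logits_clean))  # logit pairing term
loss.backward()
print(float(loss))
```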
