Search Results for author: Jannik Strötgen

Found 19 papers, 7 papers with code

A Study on Entity Linking Across Domains: Which Data is Best for Fine-Tuning?

no code implementations • RepL4NLP (ACL) 2022 • Hassan Soliman, Heike Adel, Mohamed H. Gad-Elrab, Dragan Milchevski, Jannik Strötgen

In particular, we represent the entities of different KGs in a joint vector space and address the questions of which data is best suited for creating and fine-tuning that space, and whether fine-tuning harms performance on the general domain.

Entity Linking
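As a hedged illustration of the joint-space idea, the sketch below embeds entity descriptions from two KGs with a sentence encoder and links a mention by nearest-neighbor search. The encoder choice, entity descriptions, and IDs are illustrative assumptions, not the paper's actual setup.

```python
# Illustrative sketch: entities from several KGs in one joint vector space,
# linked to a mention by cosine similarity. Encoder and data are stand-ins.
import numpy as np
from sentence_transformers import SentenceTransformer  # assumed encoder choice

encoder = SentenceTransformer("all-MiniLM-L6-v2")

# Entities from two different KGs, represented by short textual descriptions.
entities = {
    "KG1:Q1": "Python, a high-level programming language",
    "KG2:E7": "Python, a genus of constricting snakes",
}
entity_vecs = encoder.encode(list(entities.values()))
entity_ids = list(entities.keys())

def link(mention_in_context: str) -> str:
    """Return the entity whose embedding is closest to the mention (cosine)."""
    m = encoder.encode([mention_in_context])[0]
    sims = entity_vecs @ m / (np.linalg.norm(entity_vecs, axis=1) * np.linalg.norm(m))
    return entity_ids[int(np.argmax(sims))]

print(link("She imported a library in Python to parse the file."))
```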

Discourse-Aware In-Context Learning for Temporal Expression Normalization

no code implementations • 11 Apr 2024 • Akash Kumar Gautam, Lukas Lange, Jannik Strötgen

In this work, we explore the feasibility of proprietary and open-source large language models (LLMs) for TE normalization using in-context learning to inject task, document, and example information into the model.

In-Context Learning
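A minimal sketch of what such prompt construction could look like, assuming a generic completion function `llm_generate` (hypothetical, standing in for any proprietary or open-source LLM API):

```python
# Sketch of assembling an in-context prompt that injects task, document,
# and example information for temporal expression (TE) normalization.

TASK = ("Normalize the temporal expression in <te>...</te> to an ISO/TIMEX3 "
        "value, using the document creation time (DCT) as reference.")

EXAMPLES = [
    ("DCT: 2024-04-11 | Text: We met <te>yesterday</te>.", "2024-04-10"),
    ("DCT: 2024-04-11 | Text: See you <te>next Monday</te>.", "2024-04-15"),
]

def build_prompt(dct: str, sentence: str) -> str:
    shots = "\n".join(f"Input: {x}\nOutput: {y}" for x, y in EXAMPLES)
    query = f"DCT: {dct} | Text: {sentence}"
    return f"{TASK}\n\n{shots}\n\nInput: {query}\nOutput:"

prompt = build_prompt("2024-04-11", "The report is due <te>in two weeks</te>.")
# normalized = llm_generate(prompt)  # hypothetical call; e.g. "2024-04-25"
print(prompt)
```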

GradSim: Gradient-Based Language Grouping for Effective Multilingual Training

no code implementations • 23 Oct 2023 • Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

However, not all languages positively influence each other, and it remains an open research question how to select the most suitable set of languages for multilingual training while avoiding negative interference among languages whose characteristics or data distributions are incompatible.

Sentiment Analysis
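A toy sketch of the gradient-similarity idea, with random vectors standing in for per-language gradients of a shared multilingual model:

```python
# Toy sketch of gradient-based language grouping: languages whose task
# gradients point in similar directions are grouped for joint training.
import numpy as np

rng = np.random.default_rng(0)
langs = ["de", "nl", "sw", "yo"]
grads = {l: rng.normal(size=128) for l in langs}  # stand-in gradient per language

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

target = "sw"
# Rank candidate languages by gradient similarity to the target language.
ranking = sorted((cos(grads[target], grads[l]), l) for l in langs if l != target)
print(list(reversed(ranking)))  # most compatible training partners first
```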

TADA: Efficient Task-Agnostic Domain Adaptation for Transformers

1 code implementation • 22 May 2023 • Chia-Chien Hung, Lukas Lange, Jannik Strötgen

Our broad evaluation on 4 downstream tasks across 14 domains, covering single- and multi-domain setups as well as high- and low-resource scenarios, reveals that TADA is an effective and efficient alternative to full domain-adaptive pre-training and adapters for domain adaptation, while introducing neither additional parameters nor complex training steps.

Domain Adaptation
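One hedged reading of a parameter-free adaptation recipe in this spirit: continue masked-language-model training on in-domain text while updating only the input embeddings and freezing the transformer body. The model name and toy training step below are illustrative, not the paper's exact procedure.

```python
# Sketch: domain adaptation without new parameters, by adapting only the
# input embeddings of a pre-trained masked language model.
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

model = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

for p in model.parameters():                          # freeze everything ...
    p.requires_grad = False
for p in model.get_input_embeddings().parameters():   # ... except embeddings
    p.requires_grad = True

optimizer = torch.optim.AdamW(model.get_input_embeddings().parameters(), lr=5e-5)

batch = tokenizer("in-domain text goes here", return_tensors="pt")
out = model(**batch, labels=batch["input_ids"])  # toy MLM loss (no masking here)
out.loss.backward()
optimizer.step()
```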

NLNDE at SemEval-2023 Task 12: Adaptive Pretraining and Source Language Selection for Low-Resource Multilingual Sentiment Analysis

no code implementations • 28 Apr 2023 • Mingyang Wang, Heike Adel, Lukas Lange, Jannik Strötgen, Hinrich Schütze

In this work, we propose to leverage language-adaptive and task-adaptive pretraining on African texts and study transfer learning with source language selection on top of an African language-centric pretrained language model.

Language Modelling • Sentiment Analysis +1

Multilingual Normalization of Temporal Expressions with Masked Language Models

1 code implementation • 20 May 2022 • Lukas Lange, Jannik Strötgen, Heike Adel, Dietrich Klakow

The detection and normalization of temporal expressions is an important task and preprocessing step for many applications.

Language Modelling • Masked Language Modeling
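A minimal sketch of casting normalization as masked-token prediction with an off-the-shelf fill-mask pipeline; the model choice and template are assumptions for illustration, not the paper's exact formulation:

```python
# Sketch: let a masked language model fill a slot of the normalized value.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-multilingual-cased")

context = "Today is 24 May 2022. The value of 'yesterday' is 2022-05-[MASK]."
for cand in fill(context, top_k=3):
    print(cand["token_str"], round(cand["score"], 3))
```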

CLIN-X: pre-trained language models and a study on cross-task transfer for concept extraction in the clinical domain

1 code implementation • 16 Dec 2021 • Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

The field of natural language processing (NLP) has recently seen a large shift towards using pre-trained language models for solving almost any task.

Clinical Concept Extraction • Sentence +1

Enriched Attention for Robust Relation Extraction

no code implementations • 22 Apr 2021 • Heike Adel, Jannik Strötgen

The performance of relation extraction models has increased considerably with the rise of neural networks.

Relation • Relation Extraction +1

To Share or not to Share: Predicting Sets of Sources for Model Transfer Learning

1 code implementation • EMNLP 2021 • Lukas Lange, Jannik Strötgen, Heike Adel, Dietrich Klakow

To this end, we study the effects of model transfer on sequence labeling across various domains and tasks, and show that our methods, based on model similarity and support vector machines, can predict promising sources, yielding performance increases of up to 24 F1 points.

Text Similarity • Transfer Learning
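A toy sketch of the source-prediction idea: featurize each (source, target) pair with similarity scores and let an SVM decide whether transfer is promising. The features and labels below are synthetic stand-ins for the model and text similarities used in the paper.

```python
# Sketch: predict whether a source model/dataset is worth transferring from.
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = rng.uniform(size=(40, 3))  # e.g. [model sim, text sim, label overlap]
y = (X @ np.array([0.6, 0.3, 0.1]) > 0.5).astype(int)  # 1 = source helped

clf = SVC(kernel="rbf").fit(X, y)
candidate = np.array([[0.8, 0.7, 0.4]])  # similarities for a new source
print("transfer promising:", bool(clf.predict(candidate)[0]))
```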

FAME: Feature-Based Adversarial Meta-Embeddings for Robust Input Representations

1 code implementation • EMNLP 2021 • Lukas Lange, Heike Adel, Jannik Strötgen, Dietrich Klakow

Combining several embeddings typically improves performance in downstream tasks as different embeddings encode different information.

NER • POS +4
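A minimal sketch of feature-based meta-embeddings, assuming three embedders of different dimensionality that are projected into a shared space and combined with attention weights; the adversarial component of FAME is omitted for brevity.

```python
# Sketch: attention-weighted combination of several embeddings per token,
# so the model can lean on the most reliable embedding for each input.
import torch

dims = [100, 300, 768]                # three embedders with different sizes
shared = 256
projs = [torch.nn.Linear(d, shared) for d in dims]
attn = torch.nn.Linear(shared, 1)

token_embs = [torch.randn(d) for d in dims]   # one token, three views
projected = torch.stack([p(e) for p, e in zip(projs, token_embs)])  # (3, 256)
weights = torch.softmax(attn(projected).squeeze(-1), dim=0)         # (3,)
meta = (weights.unsqueeze(-1) * projected).sum(dim=0)               # (256,)
print(weights.tolist(), meta.shape)
```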

NLNDE at CANTEMIST: Neural Sequence Labeling and Parsing Approaches for Clinical Concept Extraction

no code implementations • 23 Oct 2020 • Lukas Lange, Xiang Dai, Heike Adel, Jannik Strötgen

The recognition and normalization of clinical information, such as tumor morphology mentions, is an important, but complex process consisting of multiple subtasks.

Clinical Concept Extraction

NLNDE: The Neither-Language-Nor-Domain-Experts' Way of Spanish Medical Document De-Identification

no code implementations • 2 Jul 2020 • Lukas Lange, Heike Adel, Jannik Strötgen

Natural language processing has huge potential in the medical domain, which has recently led to a lot of research in this field.

De-identification

Closing the Gap: Joint De-Identification and Concept Extraction in the Clinical Domain

1 code implementation • ACL 2020 • Lukas Lange, Heike Adel, Jannik Strötgen

Exploiting natural language processing in the clinical domain requires de-identification, i.e., anonymization of personal information in texts.

De-identification
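A toy sketch of the de-identification step: detected personal-information spans are replaced with typed placeholders. The spans are hard-coded here; in the paper they would come from a trained clinical sequence-labeling model.

```python
# Sketch: anonymize detected spans by substituting typed placeholders.
text = "Patient Maria Garcia was admitted to Hospital San Juan on 02/07/2020."
spans = [(8, 20, "NAME"), (37, 54, "HOSPITAL"), (58, 68, "DATE")]  # (start, end, type)

for start, end, label in sorted(spans, reverse=True):  # right-to-left keeps offsets valid
    text = text[:start] + f"[{label}]" + text[end:]
print(text)  # Patient [NAME] was admitted to [HOSPITAL] on [DATE].
```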

On the Choice of Auxiliary Languages for Improved Sequence Tagging

no code implementations • WS 2020 • Lukas Lange, Heike Adel, Jannik Strötgen

Recent work showed that embeddings from related languages can improve the performance of sequence tagging, even for monolingual models.

Part-Of-Speech Tagging
