no code implementations • 13 May 2024 • Elena Merdjanovska, Ansar Aynetdinov, Alan Akbik
Available training data for named entity recognition (NER) often contains a significant percentage of incorrect labels for entity types and entity boundaries.
1 code implementation • 29 Apr 2024 • Patrick Haller, Jonas Golde, Alan Akbik
Recent advancements in large language models (LLMs) have showcased their exceptional abilities across various tasks, such as code generation, problem-solving and reasoning.
Ranked #1 on Code Generation on PECC
2 code implementations • 5 Apr 2024 • Jacek Wiland, Max Ploner, Alan Akbik
We release the BEAR datasets and an open-source framework that implements the probing approach to the research community to facilitate the evaluation and development of LMs.
Ranked #1 on Factual probe on BEAR-probe
2 code implementations • 22 Mar 2024 • Max Dallabetta, Conrad Dobberstein, Adrian Breiding, Alan Akbik
This paper introduces Fundus, a user-friendly news scraper that enables users to obtain millions of high-quality news articles with just a few lines of code.
no code implementations • 21 Mar 2024 • Jonas Golde, Felix Hamborg, Alan Akbik
In an initial label interpretation learning phase, the model learns to interpret such verbalized descriptions of entity types.
no code implementations • 19 Feb 2024 • Mario Sänger, Samuele Garda, Xing David Wang, Leon Weber-Genzel, Pia Droop, Benedikt Fuchs, Alan Akbik, Ulf Leser
Instead, they are applied in the wild, i.e., on application-dependent text collections different from those used for the tools' training, varying, e.g., in focus, genre, style, and text type.
no code implementations • 30 Jan 2024 • Ansar Aynetdinov, Alan Akbik
Instruction-tuned Large Language Models (LLMs) have recently showcased remarkable advancements in their ability to generate fitting responses to natural language instructions.
1 code implementation • 24 Oct 2023 • Susanna Rücker, Alan Akbik
The CoNLL-03 corpus is arguably the most well-known and utilized benchmark dataset for named entity recognition (NER).
1 code implementation • 18 Sep 2023 • Jonas Golde, Patrick Haller, Felix Hamborg, Julian Risch, Alan Akbik
Here, a powerful LLM is prompted with a task description to generate labeled data that can be used to train a downstream NLP model.
no code implementations • 7 Sep 2023 • Patrick Haller, Ansar Aynetdinov, Alan Akbik
The demo will answer this question using a model fine-tuned on text representing each of the selected biases, allowing side-by-side comparison.
no code implementations • 30 Nov 2022 • Kishaloy Halder, Josip Krapac, Alan Akbik, Anthony Brew, Matti Lyra
In a series of experiments, we show that this yields a number of interesting benefits: (1) The resulting order induced by distances in the embedding space can be used to directly explain classification decisions.
no code implementations • NAACL (ACL) 2022 • Angelo Ziletti, Alan Akbik, Christoph Berns, Thomas Herold, Marion Legler, Martina Viell
Medical coding (MC) is an essential pre-requisite for reliable data retrieval and reporting.
1 code implementation • ACL 2021 • Matthias Vogt, Ulf Leser, Alan Akbik
We define and study the task of early sexual predator detection (eSPD) in chats, where the goal is to analyze a running chat from its beginning and predict grooming attempts as early and as accurately as possible.
1 code implementation • COLING 2020 • Kishaloy Halder, Alan Akbik, Josip Krapac, Roland Vollgraf
State-of-the-art approaches for text classification leverage a transformer architecture with a linear layer on top that outputs a class distribution for a given prediction problem.
1 code implementation • 13 Nov 2020 • Stefan Schweter, Alan Akbik
Current state-of-the-art approaches for named entity recognition (NER) typically consider text at the sentence-level and thus do not model information that crosses sentence boundaries.
2 code implementations • 17 Aug 2020 • Leon Weber, Mario Sänger, Jannes Münchmeyer, Maryam Habibi, Ulf Leser, Alan Akbik
Summary: Named Entity Recognition (NER) is an important step in biomedical information extraction pipelines.
1 code implementation • NAACL 2019 • Alan Akbik, Tanja Bergmann, Duncan Blythe, Kashif Rasul, Stefan Schweter, Roland Vollgraf
We present FLAIR, an NLP framework designed to facilitate training and distribution of state-of-the-art sequence labeling, text classification and language models.
no code implementations • NAACL 2019 • Alan Akbik, Tanja Bergmann, Roland Vollgraf
We make all code and pre-trained models available to the research community for use and reproduction.
1 code implementation • COLING 2018 • Alan Akbik, Duncan Blythe, Roland Vollgraf
Recent advances in language modeling using recurrent neural networks have made it viable to model language as distributions over characters.
Ranked #2 on Chunking on Penn Treebank
no code implementations • 2 Mar 2018 • Duncan Blythe, Alan Akbik, Roland Vollgraf
Neural language models (LMs) are typically trained using only lexical features, such as surface forms of words.
no code implementations • EMNLP 2017 • Chenguang Wang, Alan Akbik, Laura Chiticariu, Yunyao Li, Fei Xia, Anbang Xu
Crowdsourcing has proven to be an effective method for generating labeled data for a range of NLP tasks.
no code implementations • EMNLP 2017 • Alan Akbik, Roland Vollgraf
Previous works proposed annotation projection in parallel corpora to inexpensively generate treebanks or propbanks for new languages.
no code implementations • COLING 2016 • Alan Akbik, Laura Chiticariu, Marina Danilevsky, Yonas Kbrom, Yunyao Li, Huaiyu Zhu
We present PolyglotIE, a web-based tool for developing extractors that perform Information Extraction (IE) over multilingual data.
no code implementations • COLING 2016 • Alan Akbik, Xinyu Guan, Yunyao Li
To address these issues, we propose to manually alias TL verbs to existing English frames.
no code implementations • COLING 2016 • Alan Akbik, Yunyao Li
To overcome this challenge, we propose the use of instance-based learning that performs no explicit generalization, but rather extrapolates predictions from the most similar instances in the training data.
no code implementations • LREC 2014 • Johannes Kirschnick, Alan Akbik, Holmer Hemsen
The increasing availability and maturity of both scalable computing architectures and deep syntactic parsers is opening up new possibilities for Relation Extraction (RE) on large corpora of natural language text.
no code implementations • LREC 2014 • Alan Akbik, Thilo Michael
We present the Weltmodell, a commonsense knowledge base that was automatically generated from aggregated dependency parse fragments gathered from over 3.5 million English language books.