Search Results for author: Jan Šnajder

Found 24 papers, 7 papers with code

Claim Check-Worthiness Detection: How Well do LLMs Grasp Annotation Guidelines?

no code implementations • 18 Apr 2024 • Laura Majer, Jan Šnajder

The increasing threat of disinformation calls for automating parts of the fact-checking pipeline.

Fact Checking

From Robustness to Improved Generalization and Calibration in Pre-trained Language Models

no code implementations • 31 Mar 2024 • Josip Jukić, Jan Šnajder

Enhancing generalization and uncertainty quantification in pre-trained language models (PLMs) is crucial for their effectiveness and reliability.

Domain Generalization, Uncertainty Quantification

LLMs for Targeted Sentiment in News Headlines: Exploring Different Levels of Prompt Prescriptiveness

no code implementations • 1 Mar 2024 • Jana Juroš, Laura Majer, Jan Šnajder

Drawing parallels with annotation paradigms for subjective tasks, we explore the influence of prompt design on the performance of LLMs for targeted sentiment analysis (TSA) of news headlines.

In-Context Learning, Sentiment Analysis +1

Are ELECTRA's Sentence Embeddings Beyond Repair? The Case of Semantic Textual Similarity

no code implementations • 20 Feb 2024 • Ivan Rep, David Dukić, Jan Šnajder

While BERT produces high-quality sentence embeddings, its pre-training computational cost is a significant drawback.

Semantic Textual Similarity, Sentence +3

Looking Right is Sometimes Right: Investigating the Capabilities of Decoder-only LLMs for Sequence Labeling

no code implementations • 25 Jan 2024 • David Dukić, Jan Šnajder

While fine-tuned MLM-based encoders consistently outperform causal language modeling decoders of comparable size, recent decoder-only large language models (LLMs) perform on par with smaller MLM-based encoders.

Causal Language Modeling, Language Modelling +2

Out-of-Distribution Detection by Leveraging Between-Layer Transformation Smoothness

1 code implementation • 4 Oct 2023 • Fran Jelenić, Josip Jukić, Martin Tutek, Mate Puljiz, Jan Šnajder

Effective out-of-distribution (OOD) detection is crucial for reliable machine learning models, yet most current methods are limited in practical use due to requirements like access to training data or intervention in training.

Out-of-Distribution Detection, Out of Distribution (OOD) Detection +2

Leveraging Open Information Extraction for More Robust Domain Transfer of Event Trigger Detection

1 code implementation • 23 May 2023 • David Dukić, Kiril Gashteovski, Goran Glavaš, Jan Šnajder

We address the problem of negative transfer in trigger detection (TD) by coupling triggers between domains using subject-object relations obtained from a rule-based open information extraction (OIE) system.

Event Detection, Language Modelling +2

Parameter-Efficient Language Model Tuning with Active Learning in Low-Resource Settings

1 code implementation • 23 May 2023 • Josip Jukić, Jan Šnajder

Pre-trained language models (PLMs) have ignited a surge in demand for effective fine-tuning techniques, particularly in low-resource domains and languages.

Active Learning, Language Modelling +2

Paragraph-level Citation Recommendation based on Topic Sentences as Queries

no code implementations • 20 May 2023 • Zoran Medić, Jan Šnajder

Citation recommendation (CR) models may help authors find relevant articles at various stages of the paper writing process.

Citation Recommendation, Sentence

On Dataset Transferability in Active Learning for Transformers

1 code implementation • 16 May 2023 • Fran Jelenić, Josip Jukić, Nina Drobac, Jan Šnajder

We link the active learning (AL) dataset transferability to the similarity of instances queried by the different pre-trained language models (PLMs) and show that AL methods with similar acquisition sequences produce highly transferable datasets regardless of the models used.

Active Learning, text-classification +1

Data Augmentation for Neural NLP

no code implementations • 22 Feb 2023 • Domagoj Pluščec, Jan Šnajder

Data scarcity is a problem that occurs in languages and tasks where we do not have large amounts of labeled data but want to use state-of-the-art models.

Data Augmentation

You Are What You Talk About: Inducing Evaluative Topics for Personality Analysis

no code implementations • 1 Feb 2023 • Josip Jukić, Iva Vukojević, Jan Šnajder

Expressing attitude or stance toward entities and concepts is an integral part of human behavior and personality.

Topic Models

Smooth Sailing: Improving Active Learning for Pre-trained Language Models with Representation Smoothness Analysis

no code implementations • 20 Dec 2022 • Josip Jukić, Jan Šnajder

Developed to alleviate prohibitive labeling costs, active learning (AL) methods aim to reduce label complexity in supervised learning.

Active Learning

Easy to Decide, Hard to Agree: Reducing Disagreements Between Saliency Methods

no code implementations • 15 Nov 2022 • Josip Jukić, Martin Tutek, Jan Šnajder

By connecting our findings to instance categories based on training dynamics, we show that the agreement of saliency method explanations is very low for easy-to-learn instances.

ALANNO: An Active Learning Annotation System for Mortals

no code implementations • 11 Nov 2022 • Josip Jukić, Fran Jelenić, Miroslav Bićanić, Jan Šnajder

Supervised machine learning has become the cornerstone of today's data-driven society, increasing the need for labeled data.

Active Learning, Management

Large-scale Evaluation of Transformer-based Article Encoders on the Task of Citation Recommendation

1 code implementation • sdp (COLING) 2022 • Zoran Medić, Jan Šnajder

As a remedy for the limitations of the existing benchmarks, we propose a new benchmark dataset for evaluating scientific article representations: Multi-Domain Citation Recommendation dataset (MDCR), which covers different scientific fields and contains challenging candidate pools.

Citation Recommendation, Retrieval

A Topic Coverage Approach to Evaluation of Topic Models

1 code implementation • 11 Dec 2020 • Damir Korenčić, Strahil Ristov, Jelena Repar, Jan Šnajder

When topic models are used for discovery of topics in text collections, a question that arises naturally is how well the model-induced topics correspond to topics of interest to the analyst.

Topic coverage

PANDORA Talks: Personality and Demographics on Reddit

no code implementations • NAACL (SocialNLP) 2021 • Matej Gjurković, Mladen Karan, Iva Vukojević, Mihaela Bošnjak, Jan Šnajder

Personality and demographics are important variables in social sciences, while in NLP they can aid in interpretability and removal of societal biases.

Gender Classification

Not Just Depressed: Bipolar Disorder Prediction on Reddit

no code implementations • 12 Nov 2018 • Ivan Sekulić, Matej Gjurković, Jan Šnajder

Bipolar disorder, an illness characterized by manic and depressive episodes, affects more than 60 million people worldwide.

Social Media Argumentation Mining: The Quest for Deliberateness in Raucousness

no code implementations • 31 Dec 2016 • Jan Šnajder

Argumentation mining from social media content has attracted increasing attention.

Position
