Search Results for author: Kalina Bontcheva

Found 64 papers, 13 papers with code

Large Language Models Offer an Alternative to the Traditional Approach of Topic Modelling

no code implementations • 24 Mar 2024 • Yida Mu, Chun Dong, Kalina Bontcheva, Xingyi Song

Topic modelling, as a well-established unsupervised technique, has found extensive use in automatically detecting significant topics within a corpus of documents.

Lying Blindly: Bypassing ChatGPT's Safeguards to Generate Hard-to-Detect Disinformation Claims at Scale

no code implementations • 13 Feb 2024 • Freddy Heppell, Mehmet E. Bakir, Kalina Bontcheva

As Large Language Models (LLMs) become more proficient, their misuse in large-scale viral disinformation campaigns is a growing concern.

Don't Waste a Single Annotation: Improving Single-Label Classifiers Through Soft Labels

no code implementations • 9 Nov 2023 • Ben Wu, Yue Li, Yida Mu, Carolina Scarton, Kalina Bontcheva, Xingyi Song

In this paper, we address the limitations of the common data annotation and training methods for objective single-label classification tasks.

Analysing State-Backed Propaganda Websites: a New Dataset and Linguistic Study

1 code implementation • 21 Oct 2023 • Freddy Heppell, Kalina Bontcheva, Carolina Scarton

This paper analyses two hitherto unstudied sites sharing state-backed disinformation, Reliable Recent News (rrn.world) and WarOnFakes (waronfakes.com), which publish content in Arabic, Chinese, English, French, German, and Spanish.

Examining Temporal Bias in Abusive Language Detection

no code implementations • 25 Sep 2023 • Mali Jin, Yida Mu, Diana Maynard, Kalina Bontcheva

The use of abusive language online has become an increasingly pervasive problem that damages both individuals and society, with effects ranging from psychological harm right through to escalation to real-life violence and even death.

Abusive Language

Examining the Limitations of Computational Rumor Detection Models Trained on Static Datasets

no code implementations • 20 Sep 2023 • Yida Mu, Xingyi Song, Kalina Bontcheva, Nikolaos Aletras

A crucial aspect of a rumor detection model is its ability to generalize, particularly its ability to detect emerging, previously unknown rumors.

Detecting Misinformation with LLM-Predicted Credibility Signals and Weak Supervision

no code implementations • 14 Sep 2023 • João A. Leite, Olesya Razuvayevskaya, Kalina Bontcheva, Carolina Scarton

Credibility signals represent a wide range of heuristics that are typically used by journalists and fact-checkers to assess the veracity of online content.

Misinformation

Comparison between parameter-efficient techniques and full fine-tuning: A case study on multilingual news article classification

1 code implementation • 14 Aug 2023 • Olesya Razuvayevskaya, Ben Wu, Joao A. Leite, Freddy Heppell, Ivan Srba, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Adapters and Low-Rank Adaptation (LoRA) are parameter-efficient fine-tuning techniques designed to make the training of language models more efficient.

Multilingual text classification text-classification +1

Finding Already Debunked Narratives via Multistage Retrieval: Enabling Cross-Lingual, Cross-Dataset and Zero-Shot Learning

no code implementations • 10 Aug 2023 • Iknoor Singh, Carolina Scarton, Xingyi Song, Kalina Bontcheva

The task of retrieving already debunked narratives aims to detect stories that have already been fact-checked.

Fact Checking Misinformation +3

Navigating Prompt Complexity for Zero-Shot Classification: A Study of Large Language Models in Computational Social Science

no code implementations • 23 May 2023 • Yida Mu, Ben P. Wu, William Thorne, Ambrose Robinson, Nikolaos Aletras, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Instruction-tuned Large Language Models (LLMs) have exhibited impressive language understanding and the capacity to generate responses that follow specific prompts.

Zero-Shot Learning

A Large-Scale Comparative Study of Accurate COVID-19 Information versus Misinformation

no code implementations • 10 Apr 2023 • Yida Mu, Ye Jiang, Freddy Heppell, Iknoor Singh, Carolina Scarton, Kalina Bontcheva, Xingyi Song

This motivated us to carry out a comparative study of the characteristics of COVID-19 misinformation versus those of accurate COVID-19 information through a large-scale computational analysis of over 242 million tweets.

Misinformation

Examining Temporalities on Stance Detection towards COVID-19 Vaccination

no code implementations • 10 Apr 2023 • Yida Mu, Mali Jin, Kalina Bontcheva, Xingyi Song

It is crucial for policymakers to have a comprehensive understanding of the public's stance towards vaccination on a large scale.

Stance Classification Stance Detection

SheffieldVeraAI at SemEval-2023 Task 3: Mono and multilingual approaches for news genre, topic and persuasion technique classification

1 code implementation • 16 Mar 2023 • Ben Wu, Olesya Razuvayevskaya, Freddy Heppell, João A. Leite, Carolina Scarton, Kalina Bontcheva, Xingyi Song

For Subtask 2 (Framing), we achieved first place in 3 languages, and the best average rank across all the languages, by using two separate ensembles: a monolingual RoBERTa-MUPPET-LARGE and an ensemble of XLM-RoBERTa-LARGE with adapters and task adaptive pretraining.

It's about Time: Rethinking Evaluation on Rumor Detection Benchmarks using Chronological Splits

1 code implementation • 6 Feb 2023 • Yida Mu, Kalina Bontcheva, Nikolaos Aletras

New events emerge over time influencing the topics of rumors in social media.

VaxxHesitancy: A Dataset for Studying Hesitancy towards COVID-19 Vaccination on Twitter

1 code implementation • 17 Jan 2023 • Yida Mu, Mali Jin, Charlie Grimshaw, Carolina Scarton, Kalina Bontcheva, Xingyi Song

Annotated data is also necessary for training data-driven models for more nuanced analysis of attitudes towards vaccination.

Language Modelling

On the Impact of Temporal Concept Drift on Model Explanations

1 code implementation • 17 Oct 2022 • Zhixue Zhao, George Chrysostomou, Kalina Bontcheva, Nikolaos Aletras

Explanation faithfulness of model predictions in natural language processing is typically evaluated on held-out data from the same temporal distribution as the training data (i.e. synchronous settings).

Text Classification

Classifying COVID-19 vaccine narratives

no code implementations • 18 Jul 2022 • Yue Li, Carolina Scarton, Xingyi Song, Kalina Bontcheva

This paper addresses the need for monitoring and analysing vaccine narratives online by introducing a novel vaccine narrative classification task, which categorises COVID-19 vaccine claims into one of seven categories.

Data Augmentation

Categorising Fine-to-Coarse Grained Misinformation: An Empirical Study of COVID-19 Infodemic

no code implementations • 22 Jun 2021 • Ye Jiang, Xingyi Song, Carolina Scarton, Ahmet Aker, Kalina Bontcheva

In this paper, we introduce a fine-grained annotated misinformation tweets dataset including social behaviours annotation (e.g. comment or question to the misinformation).

Misinformation

European Language Grid: A Joint Platform for the European Language Technology Community

no code implementations • EACL 2021 • Georg Rehm, Stelios Piperidis, Kalina Bontcheva, Jan Hajic, Victoria Arranz, Andrejs Vasiļjevs, Gerhard Backfried, Jose Manuel Gomez-Perez, Ulrich Germann, Rémi Calizzano, Nils Feldhus, Stefanie Hegele, Florian Kintzel, Katrin Marheinecke, Julian Moreno-Schneider, Dimitris Galanis, Penny Labropoulou, Miltos Deligiannis, Katerina Gkirtzou, Athanasia Kolovou, Dimitris Gkoumas, Leon Voukoutis, Ian Roberts, Jana Hamrlova, Dusan Varis, Lukas Kacena, Khalid Choukri, Valérie Mapelli, Mickaël Rigault, Julija Melnika, Miro Janosik, Katja Prinz, Andres Garcia-Silva, Cristian Berrio, Ondrej Klejch, Steve Renals

Europe is a multilingual society, in which dozens of languages are spoken.

MP Twitter Engagement and Abuse Post-first COVID-19 Lockdown in the UK: White Paper

no code implementations • 4 Mar 2021 • Tracie Farrell, Mehmet Bakir, Kalina Bontcheva

This work covers the period of June to December 2020 and analyses Twitter abuse in replies to UK MPs.

Multistage BiCross encoder for multilingual access to COVID-19 health information

1 code implementation • 8 Jan 2021 • Iknoor Singh, Carolina Scarton, Kalina Bontcheva

The Coronavirus (COVID-19) pandemic has led to a rapidly growing 'infodemic' of health information online.

Retrieval

Measuring What Counts: The case of Rumour Stance Classification

no code implementations • Asian Chapter of the Association for Computational Linguistics 2020 • Carolina Scarton, Diego F. Silva, Kalina Bontcheva

This paper specifically questions the evaluation metrics used in these shared tasks.

Classification General Classification +3

Toxic Language Detection in Social Media for Brazilian Portuguese: New Dataset and Multilingual Analysis

1 code implementation • Asian Chapter of the Association for Computational Linguistics 2020 • João A. Leite, Diego F. Silva, Kalina Bontcheva, Carolina Scarton

Therefore, identifying these comments is an important task for studying and preventing the proliferation of toxicity in social media.

Ranked #1 on Hate Speech Detection on ToLD-Br

Hate Speech Detection Multi-Label Classification +1

Classification Aware Neural Topic Model and its Application on a New COVID-19 Disinformation Corpus

no code implementations • 5 Jun 2020 • Xingyi Song, Johann Petrak, Ye Jiang, Iknoor Singh, Diana Maynard, Kalina Bontcheva

The explosion of disinformation accompanying the COVID-19 pandemic has overloaded fact-checkers and media worldwide, and brought a new major challenge to government responses worldwide.

Fact Checking General Classification

Using Deep Neural Networks with Intra- and Inter-Sentence Context to Classify Suicidal Behaviour

no code implementations • LREC 2020 • Xingyi Song, Johnny Downs, Sumithra Velupillai, Rachel Holden, Maxim Kikoler, Kalina Bontcheva, Rina Dutta, Angus Roberts

Identifying statements related to suicidal behaviour in psychiatric electronic health records (EHRs) is an important step when modeling that behaviour, and when assessing suicide risk.

Classification General Classification +1

Measuring the Impact of Readability Features in Fake News Detection

no code implementations • LREC 2020 • Roney Santos, Gabriela Pedro, Sidney Leal, Oto Vale, Thiago Pardo, Kalina Bontcheva, Carolina Scarton

The proliferation of fake news is a current issue that influences a number of important areas of society, such as politics, economy and health.

Classification Fake News Detection +1

Towards an Interoperable Ecosystem of AI and LT Platforms: A Roadmap for the Implementation of Different Levels of Interoperability

1 code implementation • LREC 2020 • Georg Rehm, Dimitrios Galanis, Penny Labropoulou, Stelios Piperidis, Martin Welß, Ricardo Usbeck, Joachim Köhler, Miltos Deligiannis, Katerina Gkirtzou, Johannes Fischer, Christian Chiarcos, Nils Feldhus, Julián Moreno-Schneider, Florian Kintzel, Elena Montiel, Víctor Rodríguez Doncel, John P. McCrae, David Laqua, Irina Patricia Theile, Christian Dittmar, Kalina Bontcheva, Ian Roberts, Andrejs Vasiljevs, Andis Lagzdiņš

With regard to the wider area of AI/LT platform interoperability, we concentrate on two core aspects: (1) cross-platform search and discovery of resources and services; (2) composition of cross-platform service workflows.

The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe

no code implementations • LREC 2020 • Georg Rehm, Katrin Marheinecke, Stefanie Hegele, Stelios Piperidis, Kalina Bontcheva, Jan Hajič, Khalid Choukri, Andrejs Vasiļjevs, Gerhard Backfried, Christoph Prinz, José Manuel Gómez Pérez, Luc Meertens, Paul Lukowicz, Josef van Genabith, Andrea Lösch, Philipp Slusallek, Morten Irgens, Patrick Gatellier, Joachim Köhler, Laure Le Bars, Dimitra Anastasiou, Albina Auksoriūtė, Núria Bel, António Branco, Gerhard Budin, Walter Daelemans, Koenraad De Smedt, Radovan Garabík, Maria Gavriilidou, Dagmar Gromann, Svetla Koeva, Simon Krek, Cvetana Krstev, Krister Lindén, Bernardo Magnini, Jan Odijk, Maciej Ogrodniczuk, Eiríkur Rögnvaldsson, Mike Rosner, Bolette Sandford Pedersen, Inguna Skadiņa, Marko Tadić, Dan Tufiş, Tamás Váradi, Kadri Vider, Andy Way, François Yvon

Multilingualism is a cultural cornerstone of Europe and firmly anchored in the European treaties including full language equality.

Misconceptions

European Language Grid: An Overview

no code implementations • LREC 2020 • Georg Rehm, Maria Berger, Ela Elsholz, Stefanie Hegele, Florian Kintzel, Katrin Marheinecke, Stelios Piperidis, Miltos Deligiannis, Dimitris Galanis, Katerina Gkirtzou, Penny Labropoulou, Kalina Bontcheva, David Jones, Ian Roberts, Jan Hajic, Jana Hamrlová, Lukáš Kačena, Khalid Choukri, Victoria Arranz, Andrejs Vasiļjevs, Orians Anvari, Andis Lagzdiņš, Jūlija Meļņika, Gerhard Backfried, Erinç Dikici, Miroslav Janosik, Katja Prinz, Christoph Prinz, Severin Stampler, Dorothea Thomas-Aniola, José Manuel Gómez Pérez, Andres Garcia Silva, Christian Berrío, Ulrich Germann, Steve Renals, Ondrej Klejch

With 24 official EU and many additional languages, multilingualism in Europe and an inclusive Digital Single Market can only be enabled through Language Technologies (LTs).

Journalist-in-the-Loop: Continuous Learning as a Service for Rumour Analysis

no code implementations • IJCNLP 2019 • Twin Karmakharm, Nikolaos Aletras, Kalina Bontcheva

The system features a rumour annotation service that allows journalists to easily provide feedback for a given social media post through a web-based interface.

Rumour Detection

The evolution of argumentation mining: From models to social media and emerging tools

no code implementations • 4 Jul 2019 • Anastasios Lytos, Thomas Lagkas, Panagiotis Sarigiannidis, Kalina Bontcheva

In this survey article, we bridge the gap between theoretical approaches of argumentation mining and pragmatic schemes that satisfy the needs of social media generated data, recognizing the need for adapting more flexible and expandable schemes, capable of adjusting to the argumentation conditions that exist in social media.

Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network

no code implementations • SEMEVAL 2019 • Ye Jiang, Johann Petrak, Xingyi Song, Kalina Bontcheva, Diana Maynard

This paper describes the participation of team "bertha-von-suttner" in the SemEval2019 task 4 Hyperpartisan News Detection task.

Sentence Word Embeddings

SemEval-2019 Task 7: RumourEval, Determining Rumour Veracity and Support for Rumours

no code implementations • SEMEVAL 2019 • Genevieve Gorrell, Elena Kochkina, Maria Liakata, Ahmet Aker, Arkaitz Zubiaga, Kalina Bontcheva, Leon Derczynski

Rumour verification is characterised by the need to consider evolving conversations and news updates to reach a verdict on a rumour's veracity.

Rumour Detection

RumourEval 2019: Determining Rumour Veracity and Support for Rumours

no code implementations • 18 Sep 2018 • Genevieve Gorrell, Kalina Bontcheva, Leon Derczynski, Elena Kochkina, Maria Liakata, Arkaitz Zubiaga

This is the proposal for RumourEval-2019, which will run in early 2019 as part of that year's SemEval event.

Rumour Detection

Can Rumour Stance Alone Predict Veracity?

no code implementations • COLING 2018 • Sebastian Dungs, Ahmet Aker, Norbert Fuhr, Kalina Bontcheva

Prior manual studies of rumours suggested that crowd stance can give insights into the actual rumour veracity.

General Classification Rumour Detection +1

Helping Crisis Responders Find the Informative Needle in the Tweet Haystack

1 code implementation • 29 Jan 2018 • Leon Derczynski, Kenny Meesters, Kalina Bontcheva, Diana Maynard

Messages are filtered for informativeness based on a definition of the concept drawn from prior research and crisis response experts.

General Classification Informativeness

Discourse-Aware Rumour Stance Classification in Social Media Using Sequential Classifiers

no code implementations • 6 Dec 2017 • Arkaitz Zubiaga, Elena Kochkina, Maria Liakata, Rob Procter, Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Isabelle Augenstein

We show that sequential classifiers that exploit the use of discourse properties in social media conversations while using only local features, outperform non-sequential classifiers.

General Classification Stance Classification

Simple Open Stance Classification for Rumour Analysis

2 code implementations • RANLP 2017 • Ahmet Aker, Leon Derczynski, Kalina Bontcheva

Stance classification determines the attitude, or stance, in a (typically short) text.

Classification General Classification +2

Automatic Summarization of Online Debates

no code implementations • RANLP 2017 • Nattapong Sanchan, Ahmet Aker, Kalina Bontcheva

In our work, we investigate two different clustering approaches for the generation of the summaries.

Clustering Text Summarization

Gold Standard Online Debates Summaries and First Experiments Towards Automatic Summarization of Online Debate Data

no code implementations • 15 Aug 2017 • Nattapong Sanchan, Ahmet Aker, Kalina Bontcheva

In this paper, we collected and annotated debate data for an automatic summarization task.

Extractive Summarization Information Retrieval +4

SemEval-2017 Task 8: RumourEval: Determining rumour veracity and support for rumours

no code implementations • SEMEVAL 2017 • Leon Derczynski, Kalina Bontcheva, Maria Liakata, Rob Procter, Geraldine Wong Sak Hoi, Arkaitz Zubiaga

Media is full of false claims.

Rumour Detection

Detection and Resolution of Rumours in Social Media: A Survey

no code implementations • 3 Apr 2017 • Arkaitz Zubiaga, Ahmet Aker, Kalina Bontcheva, Maria Liakata, Rob Procter

Despite the increasing use of social media platforms for information and news gathering, its unmoderated nature often leads to the emergence and spread of rumours, i.e. pieces of information that are unverified at the time of posting.

Classification General Classification +3

Generalisation in Named Entity Recognition: A Quantitative Analysis

no code implementations • 11 Jan 2017 • Isabelle Augenstein, Leon Derczynski, Kalina Bontcheva

Unseen NEs, in particular, play an important role, which have a higher incidence in diverse genres such as social media than in more regular genres such as newswire.

named-entity-recognition Named Entity Recognition +1

Broad Twitter Corpus: A Diverse Named Entity Recognition Resource

no code implementations • COLING 2016 • Leon Derczynski, Kalina Bontcheva, Ian Roberts

One of the main obstacles, hampering method development and comparative evaluation of named entity recognition in social media, is the lack of a sizeable, diverse, high quality annotated corpus, analogous to the CoNLL'2003 news dataset.

named-entity-recognition Named Entity Recognition +1

User profiling with geo-located posts and demographic data

no code implementations • WS 2016 • Adam Poulston, Mark Stevenson, Kalina Bontcheva

Using Gaussian Processes for Rumour Stance Classification in Social Media

no code implementations • 7 Sep 2016 • Michal Lukasik, Kalina Bontcheva, Trevor Cohn, Arkaitz Zubiaga, Maria Liakata, Rob Procter

Social media tend to be rife with rumours while new reports are released piecemeal during breaking news.

Gaussian Processes General Classification +3

Hawkes Processes for Continuous Time Sequence Classification: an Application to Rumour Stance Classification in Twitter

no code implementations • ACL 2016 • Michal Lukasik, P. K. Srijith, Duy Vu, Kalina Bontcheva, Arkaitz Zubiaga, Trevor Cohn

Classification General Classification +3

Stance Detection with Bidirectional Conditional Encoding

1 code implementation • EMNLP 2016 • Isabelle Augenstein, Tim Rocktäschel, Andreas Vlachos, Kalina Bontcheva

Stance detection is the task of classifying the attitude expressed in a text towards a target such as Hillary Clinton to be "positive", "negative" or "neutral".

Stance Detection

USFD at SemEval-2016 Task 6: Any-Target Stance Detection on Twitter with Autoencoders

no code implementations • SEMEVAL 2016 • Isabelle Augenstein, Andreas Vlachos, Kalina Bontcheva

Natural Language Inference Sentiment Analysis +1

Monolingual Social Media Datasets for Detecting Contradiction and Entailment

no code implementations • LREC 2016 • Piroska Lendvai, Isabelle Augenstein, Kalina Bontcheva, Thierry Declerck

Entailment recognition approaches are useful for application domains such as information extraction, question answering or summarisation, for which evidence from multiple sentences needs to be combined.

Natural Language Inference Question Answering +1

Challenges of Evaluating Sentiment Analysis Tools on Social Media

no code implementations • LREC 2016 • Diana Maynard, Kalina Bontcheva

This paper discusses the challenges in carrying out fair comparative evaluations of sentiment analysis systems.

Sentiment Analysis

USFD: Twitter NER with Drift Compensation and Linked Data

no code implementations • WS 2015 • Leon Derczynski, Isabelle Augenstein, Kalina Bontcheva

This paper describes a pilot NER system for Twitter, comprising the USFD system entry to the W-NUT 2015 NER shared task.

Clustering NER

Efficient Named Entity Annotation through Pre-empting

no code implementations • RANLP 2015 • Leon Derczynski, Kalina Bontcheva

Active Learning Named Entity Recognition (NER)

Modeling Tweet Arrival Times using Log-Gaussian Cox Processes

no code implementations • EMNLP 2015 • Michal Lukasik, P. K. Srijith, Trevor Cohn, Kalina Bontcheva

Rumour Detection Time Series Analysis

Point Process Modelling of Rumour Dynamics in Social Media

no code implementations • IJCNLP 2015 • Michal Lukasik, Trevor Cohn, Kalina Bontcheva

Epidemiology Multi-Task Learning +2

Classifying Tweet Level Judgements of Rumours in Social Media

no code implementations • EMNLP 2015 • Michal Lukasik, Trevor Cohn, Kalina Bontcheva

Social media is a rich source of rumours and corresponding community reactions.

Multi-Task Learning Rumour Detection +1

Analysis of Named Entity Recognition and Linking for Tweets

no code implementations • 27 Oct 2014 • Leon Derczynski, Diana Maynard, Giuseppe Rizzo, Marieke van Erp, Genevieve Gorrell, Raphaël Troncy, Johann Petrak, Kalina Bontcheva

Applying natural language processing for mining and intelligent information access to tweets (a form of microblog) is a challenging, emerging research area.

Entity Disambiguation Language Identification +4

Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines

no code implementations • LREC 2014 • Marta Sabou, Kalina Bontcheva, Leon Derczynski, Arno Scharl

Crowdsourcing is an emerging collaborative approach that can be used for the acquisition of annotated corpora and a wide range of other linguistic resources.

Domain Adaptation Natural Language Inference +3