Hate Speech Detection

169 papers with code • 14 benchmarks • 40 datasets

Hate speech detection is the task of detecting whether communication such as text or audio contains hatred and/or encourages violence towards a person or a group of people. This is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, and age. Example benchmarks include ETHOS and HateXplain. Models can be evaluated with metrics such as the F-score (F-measure).
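The F-score combines precision and recall on the positive (hateful) class. A minimal sketch of computing it for binary hate/non-hate labels, assuming simple string labels:

```python
def f1_score(gold, pred, positive="hate"):
    """Compute F1 for the positive class from gold and predicted labels."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

gold = ["hate", "none", "hate", "none"]
pred = ["hate", "hate", "none", "none"]
print(f1_score(gold, pred))  # tp=1, fp=1, fn=1 -> P=0.5, R=0.5, F1=0.5
```

Benchmark leaderboards typically report macro- or weighted-averaged F1 over all classes; libraries such as scikit-learn provide these averages out of the box.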

Libraries

Use these libraries to find Hate Speech Detection models and implementations

Latest papers with no code

An Investigation of Large Language Models for Real-World Hate Speech Detection

no code yet • 7 Jan 2024

Our study reveals that a meticulously crafted reasoning prompt can effectively capture the context of hate speech by fully utilizing the knowledge base in LLMs, significantly outperforming existing techniques.
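The exact prompt wording used by the authors is not given here; as a hypothetical illustration only, a reasoning-style classification prompt might be assembled like this (the function name and template text are assumptions):

```python
def build_reasoning_prompt(text):
    """Build a hypothetical step-by-step reasoning prompt for an LLM
    hate-speech classifier; not the authors' actual prompt."""
    return (
        "Consider the following message and reason step by step about its "
        "target, tone, and context before answering.\n"
        f"Message: {text}\n"
        "Question: Does this message contain hate speech? "
        "Explain your reasoning, then answer 'yes' or 'no'."
    )

print(build_reasoning_prompt("example message"))
```

The idea, per the abstract, is that prompting the model to reason about context first draws on more of the LLM's knowledge than asking for a bare label.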

HCDIR: End-to-end Hate Context Detection, and Intensity Reduction model for online comments

no code yet • 20 Dec 2023

We masked 50% of the hateful words in comments identified as hateful and predicted alternative words for these masked terms to generate convincing sentences.
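Not the paper's code, but a toy sketch of the masking step: given a (hypothetical) lexicon of hateful terms, roughly half of the hateful words in a comment are replaced with a `[MASK]` token, ready for a masked language model to fill in:

```python
import random

# Hypothetical lexicon; the paper's actual hateful-word identification
# is model-based, not a fixed list.
HATEFUL_WORDS = {"awful", "stupid", "disgusting"}

def mask_half_hateful(comment, seed=0):
    """Mask ~50% of the hateful words in a comment with [MASK]."""
    tokens = comment.split()
    hateful_idx = [i for i, t in enumerate(tokens) if t.lower() in HATEFUL_WORDS]
    k = (len(hateful_idx) + 1) // 2  # roughly half, rounding up
    rng = random.Random(seed)
    for i in rng.sample(hateful_idx, k):
        tokens[i] = "[MASK]"
    return " ".join(tokens)

print(mask_half_hateful("you are awful and stupid"))
```

A fill-mask model (e.g. a BERT-style LM) would then propose non-hateful alternatives for each `[MASK]` position.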

Hate Speech and Offensive Content Detection in Indo-Aryan Languages: A Battle of LSTM and Transformers

no code yet • 9 Dec 2023

Our study encompasses a wide range of pre-trained models, including BERT variants, XLM-R, and LSTM models, to assess their performance in identifying hate speech across these languages.

Contextualizing Internet Memes Across Social Media Platforms

no code yet • 18 Nov 2023

Internet memes have emerged as a novel format for communication and expressing ideas on the web.

Generative AI for Hate Speech Detection: Evaluation and Findings

no code yet • 16 Nov 2023

In addition, we explore and compare the performance of the finetuned LLMs with zero-shot hate detection using a GPT-3.5 model.

LLMs and Finetuning: Benchmarking cross-domain performance for hate speech detection

no code yet • 29 Oct 2023

To answer (2), we assessed the performance of 288 out-of-domain classifiers for a given end-domain dataset.

Break it, Imitate it, Fix it: Robustness by Generating Human-Like Attacks

no code yet • 25 Oct 2023

We demonstrate the advantages of this system on the ANLI and hate speech detection benchmark datasets - both collected via an iterative, adversarial human-and-model-in-the-loop procedure.

GASCOM: Graph-based Attentive Semantic Context Modeling for Online Conversation Understanding

no code yet • 21 Oct 2023

Specifically, we design two novel algorithms that utilise both the graph structure of the online conversation as well as the semantic information from individual posts for retrieving relevant context nodes from the whole conversation.
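The paper's attentive retrieval algorithms are not reproduced here; as a toy illustration only, context nodes can be gathered by walking the conversation's reply graph around a post (function name, data layout, and the `hops` parameter are assumptions):

```python
def context_nodes(post_id, parent_of, replies_of, hops=2):
    """Collect candidate context nodes within `hops` steps of a post
    in the reply graph: ancestors it replies to plus direct replies."""
    seen, frontier = {post_id}, [post_id]
    for _ in range(hops):
        nxt = []
        for node in frontier:
            neighbors = []
            if node in parent_of:
                neighbors.append(parent_of[node])
            neighbors.extend(replies_of.get(node, []))
            for n in neighbors:
                if n not in seen:
                    seen.add(n)
                    nxt.append(n)
        frontier = nxt
    seen.discard(post_id)
    return seen

parent_of = {"c": "b", "b": "a"}
replies_of = {"a": ["b"], "b": ["c"]}
print(context_nodes("c", parent_of, replies_of))  # {'a', 'b'}
```

GASCOM additionally scores such candidates with semantic attention over post content, rather than treating all graph neighbours equally as this sketch does.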

Probing LLMs for hate speech detection: strengths and vulnerabilities

no code yet • 19 Oct 2023

Recently, efforts have been made by social media platforms as well as by researchers to detect hateful or toxic language using large language models.

Hate Speech Detection in Limited Data Contexts using Synthetic Data Generation

no code yet • 4 Oct 2023

In this work, we propose a data augmentation approach that addresses the problem of lack of data for online hate speech detection in limited data contexts using synthetic data generation techniques.