Hate Speech Detection

164 papers with code • 14 benchmarks • 39 datasets

Hate speech detection is the task of determining whether a communication, such as text or audio, expresses hatred of, or encourages violence towards, a person or a group of people. Such hatred is usually rooted in prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models are commonly evaluated with the F-score (also called the F-measure).
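As a rough illustration of the F-score mentioned above, the following sketch computes per-class and macro-averaged F1 from gold and predicted labels. The label names ("hate", "normal"), the toy data, and the helper names `f_beta` and `macro_f1` are assumptions made for this example; they are not taken from ETHOS, HateXplain, or any specific benchmark's evaluation script.

```python
def f_beta(y_true, y_pred, positive, beta=1.0):
    """F-beta score for one class, treated as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined),
        # so the score is reported as 0.0 by convention.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f_beta(y_true, y_pred, lab) for lab in labels) / len(labels)


# Hypothetical gold labels and model predictions for six posts.
y_true = ["hate", "normal", "hate", "normal", "normal", "hate"]
y_pred = ["hate", "normal", "normal", "normal", "hate", "hate"]

print(f"F1 (hate class): {f_beta(y_true, y_pred, 'hate'):.3f}")
print(f"Macro F1:        {macro_f1(y_true, y_pred):.3f}")
```

Macro averaging weights every class equally, which is often preferred when the hateful class is much rarer than the neutral one; whether a given benchmark reports macro, micro, or weighted F1 should be checked against that benchmark's own documentation.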

Most implemented papers

DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

huggingface/transformers NeurIPS 2019

As Transfer Learning from large-scale pre-trained models becomes more prevalent in Natural Language Processing (NLP), operating these large models in on-the-edge and/or under constrained computational training or inference budgets remains challenging.

Automated Hate Speech Detection and the Problem of Offensive Language

t-davidson/hate-speech-and-offensive-language 11 Mar 2017

We train a multi-class classifier to distinguish between these different categories.

OPT: Open Pre-trained Transformer Language Models

facebookresearch/metaseq 2 May 2022

Large language models, which are often trained for hundreds of thousands of compute days, have shown remarkable capabilities for zero- and few-shot learning.

HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

punyajoy/HateXplain 18 Dec 2020

We also observe that models, which utilize the human rationales for training, perform better in reducing unintended bias towards target communities.

Comparative Studies of Detecting Abusive Language on Twitter

younggns/comparative-abusive-lang WS 2018

However, this dataset has not been comprehensively studied to its potential.

Deep Learning Models for Multilingual Hate Speech Detection

punyajoy/DE-LIMIT 14 Apr 2020

Hate speech detection is a challenging problem with most of the datasets available in only one language: English.

HateCheck: Functional Tests for Hate Speech Detection Models

paul-rottger/hate-functional-tests ACL 2021

Detecting online hate is a difficult task that even state-of-the-art models struggle with.

Detecting Online Hate Speech Using Context Aware Models

sjtuprog/fox-news-comments RANLP 2017

In the wake of a polarizing election, the cyber world is laden with hate speech.

Hate Speech Dataset from a White Supremacy Forum

aitor-garcia-p/hate-speech-dataset WS 2018

Hate speech is commonly defined as any communication that disparages a target group of people based on some characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, religion, or other characteristic.

Hateminers : Detecting Hate speech against Women

punyajoy/Hateminers-EVALITA 17 Dec 2018

With the online proliferation of hate speech, there is an urgent need for systems that can detect such harmful content.