HateXplain: A Benchmark Dataset for Explainable Hate Speech Detection

Hate speech is a challenging issue plaguing online social media. While better models for hate speech detection are continuously being developed, there is little research on the bias and interpretability aspects of hate speech detection. In this paper, we introduce HateXplain, the first benchmark hate speech dataset covering multiple aspects of the issue. Each post in our dataset is annotated from three different perspectives: the basic, commonly used 3-class classification (i.e., hate, offensive, or normal), the target community (i.e., the community that has been the victim of hate speech/offensive speech in the post), and the rationales, i.e., the portions of the post on which the labelling decision (as hate, offensive, or normal) is based. We utilize existing state-of-the-art models and observe that even models that perform very well on classification do not score high on explainability metrics like plausibility and faithfulness. We also observe that models that utilize the human rationales for training perform better at reducing unintended bias towards target communities. We have made our code and dataset public at https://github.com/punyajoy/HateXplain
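The repository releases the annotations as a single JSON file. Below is a minimal sketch of reading one entry, deriving a majority label from the three annotator votes, and merging the per-annotator rationale masks. The field names (post_tokens, annotators, rationales) follow the layout of dataset.json in the public repository, but treat the exact schema as an assumption and verify it against the released file.

```python
import json
from collections import Counter

def load_posts(path="dataset.json"):
    # Sketch only: assumes the HateXplain dataset.json maps post ids to entries.
    with open(path, encoding="utf-8") as f:
        return json.load(f)

def majority_label(entry):
    # Each post carries labels from three annotators (hatespeech/offensive/normal);
    # take the majority vote, falling back to a sentinel when there is no majority.
    votes = Counter(a["label"] for a in entry["annotators"])
    label, count = votes.most_common(1)[0]
    return label if count >= 2 else "no-majority"

def merged_rationale(entry):
    # Rationales are per-annotator binary masks over post_tokens; here a token is
    # kept if any annotator highlighted it (union of the masks).
    tokens = entry["post_tokens"]
    masks = entry.get("rationales", [])
    return [tok for i, tok in enumerate(tokens)
            if any(i < len(m) and m[i] for m in masks)]

if __name__ == "__main__":
    posts = load_posts()
    entry = next(iter(posts.values()))
    print(majority_label(entry), merged_rationale(entry))
```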


Datasets

Introduced in the Paper: HateXplain
Used in the Paper: Hate Speech
Results on the Hate Speech Detection task, HateXplain benchmark (global rank in parentheses; – marks metrics not listed in the source):

Model                     | AUROC      | Accuracy   | Macro F1
BERT-HateXplain [Attn]    | 0.851 (#3) | 0.698 (#3) | 0.687 (#3)
BERT-HateXplain [LIME]    | 0.851 (#3) | –          | 0.687 (#3)
BERT [Attn]               | 0.843 (#5) | 0.690 (#4) | 0.674 (#5)
BiRNN-HateXplain [Attn]   | 0.805 (#6) | –          | 0.629 (#6)
BiRNN-Attn [Attn]         | 0.795 (#7) | 0.621 (#6) | –
BiRNN [LIME]              | 0.767 (#9) | 0.595 (#7) | 0.575 (#8)
CNN-GRU [LIME]            | 0.793 (#8) | 0.629 (#5) | 0.614 (#7)
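For reference, here is a minimal sketch of how the three reported metrics could be computed with scikit-learn for the 3-class task. The label and probability arrays are placeholders, and averaging AUROC one-vs-rest over classes is an assumption made for illustration, not a statement of the paper's evaluation code.

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, roc_auc_score

# Hypothetical gold labels (0 = hatespeech, 1 = offensive, 2 = normal) and
# per-class probabilities from some classifier; replace with real model output.
y_true = np.array([0, 2, 1, 0, 2])
y_prob = np.array([[0.7, 0.2, 0.1],
                   [0.1, 0.3, 0.6],
                   [0.2, 0.5, 0.3],
                   [0.6, 0.3, 0.1],
                   [0.2, 0.2, 0.6]])
y_pred = y_prob.argmax(axis=1)

print("Accuracy :", accuracy_score(y_true, y_pred))
print("Macro F1 :", f1_score(y_true, y_pred, average="macro"))
print("AUROC    :", roc_auc_score(y_true, y_prob, multi_class="ovr", average="macro"))
```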
