Hate a Little Less, Love a Little More! Proactively Curbing Online Hatred via Hate Speech Normalization

ACL ARR October 2021  ·  Anonymous

Curbing online hate speech has become the need of the hour; however, a blanket ban on such activity is infeasible for several political, geographical, and cultural reasons. To reduce the severity of the problem, in this paper we introduce a novel task, hate speech normalization: weakening the intensity of hatred exhibited by an online post by paraphrasing the original content. The intention of hate speech normalization is not to condone hate but to offer users a stepping stone towards non-hate, while giving online platforms more time to monitor any improvement in a user's behaviour. To this end, we manually curated a parallel corpus of hate texts and their normalized counterparts (a normalized text is less hateful and more benign). We then introduce NACL, a Neural hAte speeCh normaLizer that operates in three stages: first, it measures the hate intensity of the original sample; second, it identifies the harmful span(s) within it; and finally, it reduces the hate intensity by paraphrasing those spans. We perform extensive experiments to measure the efficacy of the individual components and the overall performance of NACL via three-way evaluation (intrinsic, extrinsic, and human study). NACL outperforms its respective baselines, yielding a Pearson correlation of 0.683 for intensity prediction, an F1-score of 0.6911 for span identification, and a BLEU of 67.71 and a perplexity of 75.83 for normalized text generation. We further show the generalizability of NACL across other platforms (Reddit, Facebook, Gab). A scalable prototype of NACL was also deployed for the user study.
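The three-stage pipeline described above can be sketched schematically as follows. This is a minimal, illustrative stand-in under loud assumptions: the function names, the toy lexicon, and the keyword-lookup logic are invented for illustration; the paper's actual stages are learned neural models (intensity regression, span tagging, and paraphrase generation), not dictionary lookups.

```python
# Illustrative sketch of the three-stage normalization pipeline.
# Everything here (names, lexicon, keyword logic) is a hypothetical
# stand-in, not the paper's method.

HATE_LEXICON = {"idiots": "people", "trash": "questionable"}  # toy stand-in

def hate_intensity(text: str) -> float:
    """Stage 1 (stand-in): hate intensity as the fraction of flagged tokens."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(t in HATE_LEXICON for t in tokens) / len(tokens)

def hate_spans(text: str) -> list[str]:
    """Stage 2 (stand-in): identify the harmful spans (here, flagged tokens)."""
    return [t for t in text.lower().split() if t in HATE_LEXICON]

def normalize(text: str) -> str:
    """Stage 3 (stand-in): paraphrase each flagged span with a milder word."""
    return " ".join(HATE_LEXICON.get(t.lower(), t) for t in text.split())

post = "those idiots post trash"
# The normalized post should score lower on the intensity measure.
assert hate_intensity(normalize(post)) < hate_intensity(post)
```

The design point the sketch preserves is the chaining: stage 1 gates and evaluates, stage 2 localizes the rewrite, and stage 3 edits only the identified spans rather than regenerating the whole post.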

