
HatEval (SemEval 2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter)

Introduced by Basile et al. in SemEval-2019 Task 5: Multilingual Detection of Hate Speech Against Immigrants and Women in Twitter

Hate Speech is commonly defined as any communication that disparages a person or a group on the basis of a characteristic such as race, color, ethnicity, gender, sexual orientation, nationality, or religion. Given the huge amount of user-generated content on the Web, and on social media in particular, detecting Hate Speech, and thereby possibly limiting its spread, is becoming fundamental, for instance in fighting misogyny and xenophobia.

The task consists of Hate Speech detection on Twitter with two specific targets, immigrants and women, in a multilingual perspective covering Spanish and English. It is articulated around two related subtasks for each language: a basic task on Hate Speech detection, and a second one that investigates fine-grained features of hateful content in order to understand how existing approaches deal with identifying especially dangerous forms of hate, i.e. those where the incitement is against an individual rather than a group of people, and where aggressive behavior by the author is a prominent feature of the expression of hate. Participants are asked to identify, on the one hand, whether the target of hate is a single human or a group of persons and, on the other hand, whether the message author intends to be aggressive, harmful, or even to incite, in various forms, violent acts against the target.

TASK A - Hate Speech Detection against Immigrants and Women: a two-class (or binary) classification where systems have to predict whether a tweet in English or in Spanish with a given target (women or immigrants) is hateful or not hateful.
TASK B - Aggressive Behavior and Target Classification: systems are asked first to classify hateful tweets in English and Spanish (i.e., tweets where Hate Speech against women or immigrants has been identified) as aggressive or not aggressive, and second to identify the harassed target as individual or generic (i.e. a single human or a group).
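
The relationship between the two subtasks can be sketched as a small label schema. This is a minimal illustration in Python, not the official data format: the class name `TweetLabels`, its field names, and the choice to leave the Task B labels unset for non-hateful tweets are all assumptions made here for clarity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TweetLabels:
    """Hypothetical label record for one tweet (names are illustrative)."""
    hateful: bool                             # Task A: hateful vs. not hateful
    aggressive: Optional[bool] = None         # Task B: aggressive vs. not aggressive
    individual_target: Optional[bool] = None  # Task B: individual vs. generic target

    def __post_init__(self) -> None:
        task_b = (self.aggressive, self.individual_target)
        # Assumption: Task B labels are defined only for hateful tweets.
        if self.hateful and any(v is None for v in task_b):
            raise ValueError("hateful tweets need both Task B labels")
        if not self.hateful and any(v is not None for v in task_b):
            raise ValueError("Task B labels apply only to hateful tweets")


def describe(labels: TweetLabels) -> str:
    """Render the labels as a short human-readable string."""
    if not labels.hateful:
        return "not hateful"
    behavior = "aggressive" if labels.aggressive else "not aggressive"
    target = "individual target" if labels.individual_target else "generic target"
    return f"hateful, {behavior}, {target}"


if __name__ == "__main__":
    print(describe(TweetLabels(hateful=False)))
    print(describe(TweetLabels(hateful=True, aggressive=True, individual_target=False)))
```

Under these assumptions, a system for Task A only needs to predict the `hateful` field, while a Task B system additionally fills in the two fine-grained fields for tweets it considers hateful.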
