Learning Representations for Detecting Abusive Language

WS 2018  ·  Magnus Sahlgren, Tim Isbister, Fredrik Olsson

This paper addresses the question of whether it is possible to learn a generic representation that is useful for detecting various types of abusive language. The approach is inspired by recent advances in transfer learning and word embeddings, and we learn representations from two different datasets containing various degrees of abusive language. We compare the learned representation with two standard approaches: one based on lexica, and one based on data-specific n-grams. Our experiments show that learned representations *do* contain useful information that can be used to improve detection performance when training data is limited.
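The data-specific n-gram baseline mentioned in the abstract can be sketched as a simple bag-of-n-grams feature extractor. The paper does not specify its exact feature pipeline, so the function names, the choice of word bigrams, and the tokenization below are illustrative assumptions, not the authors' implementation:

```python
from collections import Counter

def word_ngrams(text, n=2):
    # Lowercase and whitespace-tokenize, then slide a window of size n
    # over the token sequence. (Tokenization choice is an assumption.)
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def ngram_features(texts, n=2):
    # One count vector (as a Counter) per input text; a classifier
    # trained on these counts would form the data-specific baseline.
    return [Counter(word_ngrams(t, n)) for t in texts]

features = ngram_features(["you are so kind", "go away now"], n=2)
```

Because such features are extracted from the training corpus itself, they are inherently data-specific, which is exactly the limitation the learned, transferable representations are meant to address.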



