Offensive Language Detection Using Brown Clustering

LREC 2020  ·  Zuoyu Tian, S K{\"u}bler, ra ·

In this study, we investigate the use of Brown clustering for offensive language detection. Brown clustering has been shown to be of little use when the task involves distinguishing word polarity in sentiment analysis tasks. In contrast to previous work, we train Brown clusters separately on positive and negative sentiment data, but then combine the information into a single complex feature per word. This way of representing words results in stable improvements in offensive language detection, when used as the only features or in combination with words or character n-grams. Brown clusters add important information, even when combined with words or character n-grams or with standard word embeddings in a convolutional neural network. However, we also found different trends between the two offensive language data sets we used.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here