A little goes a long way: Improving toxic language classification despite data scarcity

25 Sep 2020 · Mika Juuti, Tommi Gröndahl, Adrian Flanagan, N. Asokan

Detection of some types of toxic language is hampered by extreme scarcity of labeled training data. Data augmentation, the generation of new synthetic data from a labeled seed dataset, can help...


