DLRG@DravidianLangTech-EACL2021: Transformer based approachfor Offensive Language Identification on Code-Mixed Tamil

Internet advancements have made a huge impact on the communication pattern of people and their life style. People express their opinion on products, politics, movies etc. in social media. Even though, English is predominantly used, nowadays many people prefer to tweet in their native language and some- times by combining it with English. Sentiment analysis on such code-mixed tweets is challenging, due to large vocabulary, grammar and colloquial usage of many words. In this paper, the transformer based language model is applied to analyse the sentiment on Tanglish tweets, which is a combination of Tamil and English. This work has been submitted to the the shared task on DravidianLangTech- EACL2021. From the experimental results, it is shown that an F 1 score of 64% was achieved in detecting the hate speech in code-mixed Tamil-English tweets using bidirectional trans- former model.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here