no code implementations • LREC 2022 • Ubaid Azam, Hammad Rizwan, Asim Karim
In this paper, we explore different data augmentation techniques to improve hate speech detection in Roman Urdu.
no code implementations • EMNLP 2020 • Hammad Rizwan, Muhammad Haroon Shakeel, Asim Karim
The task of automatic hate-speech and offensive language detection in social media content is of utmost importance because of its implications for an unprejudiced society with respect to race, gender, or religion.
1 code implementation • 31 Mar 2020 • Abdul Rafae Khan, Asim Karim, Hassan Sajjad, Faisal Kamiran, Jia Xu
Roman Urdu is an informal form of the Urdu language written in Roman script, which is widely used in South Asia for online textual content.
1 code implementation • 4 Jan 2020 • Muhammad Haroon Shakeel, Asim Karim
Such informal and code-switched content is under-resourced in terms of labeled datasets and language models, even for popular tasks like sentiment classification.
no code implementations • 27 Dec 2019 • Muhammad Haroon Shakeel, Asim Karim, Imdadullah Khan
In this work, we present a data augmentation strategy and a multi-cascaded model for improved paraphrase detection in short texts.
1 code implementation • 29 Nov 2019 • Muhammad Haroon Shakeel, Asim Karim, Imdadullah Khan
Our model achieves high classification accuracy on this dataset and outperforms the previous model for multilingual text classification, highlighting the language independence of McM.
no code implementations • 27 Dec 2017 • Salman Ahmad Ansari, Usman Zafar, Asim Karim
This approach is motivated by the observation that text normalization is essentially a matching problem, and that nearest-neighbor matching with an adaptive similarity function is the most direct procedure for it.