no code implementations • 19 Apr 2022 • Cagri Toraman, Eyup Halit Yilmaz, Furkan Şahinuç, Oguzhan Ozcelik
Furthermore, we find that increasing the vocabulary size improves the performance of Morphological and Word-level tokenizers more than that of the de facto subword tokenizers.
1 code implementation • LREC 2022 • Cagri Toraman, Furkan Şahinuç, Eyup Halit Yilmaz
The experimental results supported by statistical tests show that Transformer-based language models outperform conventional bag-of-words and neural models by at least 5% in English and 10% in Turkish for large-scale hate speech detection.
no code implementations • 2 Sep 2021 • Eyup Halit Yilmaz, Cagri Toraman
To provide additional information regarding the query and enhance the performance of intent detection, we propose ConQX, a method for semantic expansion of spoken queries that utilizes the text generation ability of an auto-regressive language model, GPT-2.
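The expansion idea above can be sketched roughly as follows. This is a minimal illustration, not the authors' ConQX implementation: the prompt template, the `[SEP]` concatenation scheme, and both helper names are assumptions made for the sketch; in practice the continuation would come from a generative model such as GPT-2 rather than being passed in as a string.

```python
# Hypothetical sketch of semantic query expansion via a generative LM.
# Assumptions (not from the source): the prompt template, the [SEP]
# concatenation, and the helper names are illustrative only.

def build_prompt(query: str) -> str:
    # Embed the spoken query in a cloze-style prompt so an
    # auto-regressive LM can continue it with related context.
    return f'The user said: "{query}". They are asking about'

def expand_query(query: str, generated_continuation: str) -> str:
    # Append the LM's continuation to the original query so a
    # downstream intent classifier sees both the surface form and
    # the semantic expansion.
    return f"{query} [SEP] {generated_continuation.strip()}"

prompt = build_prompt("play some jazz")
# In a real pipeline, `continuation` would be sampled from the LM
# given `prompt`; here it is hard-coded for illustration.
continuation = " music from a streaming service"
expanded = expand_query("play some jazz", continuation)
```

The expanded string, rather than the raw query alone, would then be fed to the intent-detection model.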