no code implementations • 4 Jun 2022 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Gniewosz Leliwa, Michal Wroczynski, Mateusz Piech, Aleksander Smywinski-Pohl
In this research, we study the change in the performance of machine learning (ML) classifiers when various linguistic preprocessing methods of a dataset were used, with the specific focus on linguistically-backed embeddings in Convolutional Neural Networks (CNN).
no code implementations • 2 Jun 2022 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Masaki Arata, Gniewosz Leliwa, Michal Wroczynski
We study the selection of transfer languages for automatic abusive language detection.
no code implementations • 2 Nov 2021 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Aleksander Smywiński-Pohl, Gniewosz Leliwa, Michal Wroczynski
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods in order to estimate dataset complexity, which in turn is used to comparatively estimate the potential performance of machine learning (ML) classifiers prior to any training.
no code implementations • 2 Aug 2018 • Michał Ptaszyński, Gniewosz Leliwa, Mateusz Piech, Aleksander Smywiński-Pohl
There are two goals to achieve: building a gold standard cyberbullying detection dataset and measuring the performance of the Samurai cyberbullying detection system.