no code implementations • 30 Aug 2023 • Tanjim Mahmud, Michal Ptaszynski, Juuso Eronen, Fumito Masui
Based on recognizing those research gaps, we provide some suggestions for improving the general research conduct in cyberbullying detection, with a primary focus on low-resource languages.
no code implementations • 1 Jun 2023 • Juuso Eronen, Michal Ptaszynski, Karol Nowakowski, Zheng Lin Chia, Fumito Masui
This paper investigates the impact of data volume and the use of similar languages on transfer learning in a machine translation task.
no code implementations • 31 Jan 2023 • Juuso Eronen, Michal Ptaszynski, Fumito Masui
This allows us to select a more suitable transfer language which can be used to better leverage knowledge from high-resource languages in order to improve the performance of language applications lacking data.
no code implementations • 4 Jun 2022 • Juuso Eronen, Michal Ptaszynski, Fumito Masui
In most cases, word embeddings are learned only from raw tokens or, in some cases, lemmas.
no code implementations • 4 Jun 2022 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Gniewosz Leliwa, Michal Wroczynski, Mateusz Piech, Aleksander Smywiński-Pohl
In this research, we study the change in the performance of machine learning (ML) classifiers when various linguistic preprocessing methods are applied to a dataset, with a specific focus on linguistically-backed embeddings in Convolutional Neural Networks (CNNs).
no code implementations • 2 Jun 2022 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Masaki Arata, Gniewosz Leliwa, Michal Wroczynski
We study the selection of transfer languages for automatic abusive language detection.
no code implementations • 2 Nov 2021 • Juuso Eronen, Michal Ptaszynski, Fumito Masui, Aleksander Smywiński-Pohl, Gniewosz Leliwa, Michal Wroczynski
We study the effectiveness of Feature Density (FD) using different linguistically-backed feature preprocessing methods in order to estimate dataset complexity, which in turn is used to comparatively estimate the potential performance of machine learning (ML) classifiers prior to any training.
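As a rough illustration of the idea in the excerpt above, the sketch below computes a density score for a small corpus before any classifier is trained. It assumes that Feature Density is the ratio of unique features to total feature occurrences, which the excerpt itself does not state. Lowercased whitespace tokens stand in for the linguistically-backed features, and the function name `feature_density` and the sample corpus are hypothetical.

```python
from collections import Counter


def feature_density(documents):
    """Return (unique features) / (total feature occurrences) for a corpus.

    Assumption: lowercased whitespace tokens stand in for features. The
    paper's preprocessing (lemmas, POS tags, and so on) is not reproduced.
    """
    counts = Counter(
        token for doc in documents for token in doc.lower().split()
    )
    total = sum(counts.values())
    # An empty corpus has no features, so report a density of 0.0.
    return len(counts) / total if total else 0.0


# Hypothetical toy corpus: 8 token occurrences, 5 distinct tokens.
corpus = ["you are great", "you are awful", "great game"]
fd = feature_density(corpus)
print(f"feature density: {fd:.3f}")
```

Swapping the tokenizer for a different preprocessing step (for example, lemmatization) would change the score, which is the kind of comparison the excerpt describes.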