1 code implementation • LREC 2022 • Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga, Sanath Jayasena
We test on a set of different Sinhala text classification tasks and our analysis shows that out of the pre-trained multilingual models that include Sinhala (XLM-R, LaBSE, and LASER), XLM-R is the best model by far for Sinhala text classification.
no code implementations • 16 Aug 2022 • Vinura Dhananjaya, Piyumal Demotte, Surangika Ranathunga, Sanath Jayasena
We test on a set of different Sinhala text classification tasks and our analysis shows that out of the pre-trained multilingual models that include Sinhala (XLM-R, LaBSE, and LASER), XLM-R is the best model by far for Sinhala text classification.
no code implementations • 29 Sep 2021 • Mihira Kasun Vithanage, Rukshan Darshana Wijesinghe, Alex Xavier, Dumindu Tissera, Sanath Jayasena, Subha Fernando
In this paper, we present a learning environment where agents are pressured to make their emerging languages compositional by incorporating a metric of topological similarity into the loss function.
no code implementations • 6 Jul 2021 • Dumindu Tissera, Kasun Vithanage, Rukshan Wijesinghe, Alex Xavier, Sanath Jayasena, Subha Fernando, Ranga Rodrigo
The network parameters pose as the parameters of those distributions.
no code implementations • WS 2016 • Sandareka Fernando, Surangika Ranathunga, Sanath Jayasena, Gihan Dias
This paper presents a new comprehensive multi-level Part-Of-Speech tag set and a Support Vector Machine based Part-Of-Speech tagger for the Sinhala language.
no code implementations • WS 2016 • Riyafa Abdul Hameed, Nadeeshani Pathirennehelage, Anusha Ihalapathirana, Maryam Ziyad Mohamed, Surangika Ranathunga, Sanath Jayasena, Gihan Dias, Sandareka Fernando
A sentence aligned parallel corpus is an important prerequisite in statistical machine translation.