Search Results for author: Aloka Fernando

Found 4 papers, 1 paper with code

Building a Linguistic Resource : A Word Frequency List for Sinhala

no code implementations • ICON 2021 • Aloka Fernando, Gihan Dias

The word frequency list and the verified word list are the largest collections of word lists available for the Sinhala language.

POS

Quality Does Matter: A Detailed Look at the Quality and Utility of Web-Mined Parallel Corpora

1 code implementation • 12 Feb 2024 • Surangika Ranathunga, Nisansa de Silva, Menan Velayuthan, Aloka Fernando, Charitha Rathnayake

We conducted a detailed analysis of the quality of web-mined corpora for two low-resource languages (making three language pairs: English-Sinhala, English-Tamil, and Sinhala-Tamil).

Machine Translation NMT +1

Data Augmentation to Address Out-of-Vocabulary Problem in Low-Resource Sinhala-English Neural Machine Translation

no code implementations • 18 May 2022 • Aloka Fernando, Surangika Ranathunga

However, existing data augmentation (DA) techniques have addressed only one of these out-of-vocabulary (OOV) types and are limited to considering either syntactic or semantic constraints.

Data Augmentation Machine Translation +1

Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation

no code implementations • 5 Nov 2020 • Aloka Fernando, Surangika Ranathunga, Gihan Dias

This paper focuses on data augmentation techniques in which bilingual lexicon terms are expanded based on case markers, with the objective of generating new words to be used in statistical machine translation (SMT).

Data Augmentation Machine Translation +1
