scb-mt-en-th-2020 is an English-Thai machine translation dataset with over 1 million segment pairs, curated from six kinds of sources: news, Wikipedia articles, SMS messages, task-based dialogs, web-crawled data, and government documents.
Source: scb-mt-en-th-2020: A Large English-Thai Parallel Corpus