JESC (Japanese-English Subtitle Corpus)

Introduced by Pryzant et al. in JESC: Japanese-English Subtitle Corpus

The Japanese-English Subtitle Corpus (JESC) is a large Japanese-English parallel corpus covering the underrepresented domain of conversational dialogue. It contains more than 3.2 million examples, which its authors describe as the largest freely available dataset of its kind. The corpus was assembled by crawling the web for subtitles and aligning them.

Source: JESC: Japanese-English Subtitle Corpus
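
For readers who want to work with a parallel corpus like the one described above, the following is a minimal sketch of a reader for sentence pairs. It assumes, hypothetically, that each line holds one English segment and one Japanese segment separated by a single tab; the function name `iter_pairs` and the sample lines are illustrative and do not come from the corpus or its documentation. Check the actual release format before relying on it.

```python
from typing import Iterable, Iterator, Tuple


def iter_pairs(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (english, japanese) pairs from tab-separated lines.

    Assumption (not confirmed by the source): one pair per line,
    English first, Japanese second, separated by a single tab.
    Lines that do not fit this shape are skipped.
    """
    for raw in lines:
        parts = raw.rstrip("\n").split("\t")
        if len(parts) != 2:
            continue  # wrong number of fields: not a usable pair
        en, ja = (p.strip() for p in parts)
        if en and ja:
            yield en, ja


# Illustrative in-memory sample; these lines are made up for the sketch.
sample = [
    "how are you ?\t元気ですか\n",
    "malformed line without a tab\n",
    "thank you\tありがとう\n",
]
pairs = list(iter_pairs(sample))
```

To read from a file instead, pass an open file handle, for example `open(path, encoding="utf-8")`, in place of `sample`. Skipping malformed lines rather than raising is a design choice that suits noisy, web-crawled data.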
