3 dataset results for Grammatical Error Correction AND Texts AND Korean

Kor-Lang8 is a Korean grammatical error correction (GEC) dataset extracted from the NAIST Lang-8 Learner Corpora by filtering on the language label. It contains more than 109K sentence pairs.

1 PAPER • NO BENCHMARKS YET

Kor-Learner (Korean Learner Corpus)

Kor-Learner is a Korean grammatical error correction (GEC) dataset built from the NIKL learner corpus, which contains essays written by learners of Korean together with grammatical error correction annotations made by their tutors, stored in a morpheme-level XML file format. It contains more than 28K sentence pairs.

1 PAPER • NO BENCHMARKS YET

Kor-Native (Native Korean Corpus)

Kor-Native is a Korean grammatical error correction (GEC) dataset built from grammatically correct sentences collected from two sources. The correct sentences were read aloud using the Google Text-to-Speech (TTS) system, and members of the general public were tasked with taking dictation of these sentences and transcribing them. It contains more than 17K sentence pairs.

1 PAPER • NO BENCHMARKS YET
