Kor-Lang8 is a Korean grammatical error correction (GEC) dataset extracted from the NAIST Lang-8 Learner Corpora by filtering on the language label. It contains more than 109K sentence pairs.
Kor-Learner is a Korean grammatical error correction (GEC) dataset made from the NIKL learner corpus, which contains essays written by learners of Korean together with their tutors' grammatical error correction annotations, in a morpheme-level XML file format. It contains more than 28K sentence pairs.
Kor-Native is a Korean grammatical error correction (GEC) dataset built from grammatically correct sentences collected from two sources. The correct sentences were read aloud using the Google Text-to-Speech (TTS) system, and members of the general public were tasked with taking dictation of them and transcribing what they heard. It contains more than 17K sentence pairs.