Noisy Channel for Low Resource Grammatical Error Correction

WS 2019 · Simon Flachs, Oph{\'e}lie Lacroix, Anders S{\o}gaard ·

This paper describes our contribution to the low-resource track of the BEA 2019 shared task on Grammatical Error Correction (GEC). Our approach to GEC builds on the theory of the noisy channel by combining a channel model and language model. We generate confusion sets from the Wikipedia edit history and use the frequencies of edits to estimate the channel model. Additionally, we use two pre-trained language models: 1) Google{'}s BERT model, which we fine-tune for specific error types and 2) OpenAI{'}s GPT-2 model, utilizing that it can operate with previous sentences as context. Furthermore, we search for the optimal combinations of corrections using beam search.

PDF Abstract