Hierarchical Transformer Encoders for Vietnamese Spelling Correction

28 May 2021 · Hieu Tran, Cuong V. Dinh, Long Phan, Son T. Nguyen

In this paper, we propose a Hierarchical Transformer model for the Vietnamese spelling correction problem. The model consists of multiple Transformer encoders and uses both character-level and word-level information to detect errors and make corrections. In addition, to facilitate future work on Vietnamese spelling correction, we propose a realistic dataset collected from real-life texts. We compare our method with other methods and publicly available systems. The proposed method outperforms all of the contemporary methods in terms of recall, precision, and F1-score. A demo version is publicly available.
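
The abstract reports recall, precision, and F1-score but does not say how they are computed. As a rough illustration only, the sketch below scores corrections at the token level, assuming the input, the system output, and the reference are already aligned token by token. The function name `correction_scores` and this particular definition (a change counts as correct only if it matches the reference) are assumptions made for illustration; the paper's own evaluation protocol may differ.

```python
def correction_scores(source, predicted, gold):
    """Token-level precision, recall, and F1 for spelling correction.

    Assumption (not from the paper): all three sequences are aligned
    token by token, so position i refers to the same word in each.
    """
    if not (len(source) == len(predicted) == len(gold)):
        raise ValueError("sequences must be token-aligned")

    # Tokens the system altered.
    changed = sum(p != s for s, p in zip(source, predicted))
    # Tokens that actually contain an error.
    errors = sum(g != s for s, g in zip(source, gold))
    # Altered tokens that now match the reference.
    correct = sum(
        p != s and p == g for s, p, g in zip(source, predicted, gold)
    )

    precision = correct / changed if changed else 0.0
    recall = correct / errors if errors else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Made-up example: two misspelled tokens ("hocj", "trườngg").
source = ["tôi", "đi", "hocj", "ở", "trườngg"]
gold = ["tôi", "đi", "học", "ở", "trường"]
# The hypothetical system fixes one error, misses one, and breaks "đi".
predicted = ["tôi", "đii", "học", "ở", "trườngg"]

precision, recall, f1 = correction_scores(source, predicted, gold)
print(precision, recall, f1)
```

Here the system changes two tokens, one of them correctly, and the input contains two real errors, so precision and recall are both one half. Counting a wrongly altered token against precision penalizes over-correction, which matters for spelling correctors that tend to rewrite valid words.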

Datasets


Introduced in the Paper:

Viwiki-Spelling

Methods