Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners

LREC 2014 · Lianet Sep{\'u}lveda Torres, Magali Sanches Duran, S Alu{\'\i}sio, ra ·

Portuguese is a less resourced language in what concerns foreign language learning. Aiming to inform a module of a system designed to support scientific written production of Spanish native speakers learning Portuguese, we developed an approach to automatically generate a lexicon of wrong words, reproducing language transfer errors made by such foreign learners. Each item of the artificially generated lexicon contains, besides the wrong word, the respective Spanish and Portuguese correct words. The wrong word is used to identify the interlanguage error and the correct Spanish and Portuguese forms are used to generate the suggestions. Keeping control of the correct word forms, we can provide correction or, at least, useful suggestions for the learners. We propose to combine two automatic procedures to obtain the error correction: i) a similarity measure and ii) a translation algorithm based on aligned parallel corpus. The similarity-based method achieved a precision of 52{\%}, whereas the alignment-based method achieved a precision of 90{\%}. In this paper we focus only on interlanguage errors involving suffixes that have different forms in both languages. The approach, however, is very promising to tackle other types of errors, such as gender errors.

PDF Abstract

Code

Add Remove Mark official

No code implementations yet. Submit your code now

Tasks

Add Remove

Translation

Datasets

Add Datasets introduced or used in this paper

Results from the Paper

Add Remove

Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods

Add Remove

No methods listed for this paper. Add relevant methods here

Edit Social Preview

Generating a Lexicon of Errors in Portuguese to Support an Error Identification System for Spanish Native Learners

Code Edit Add Remove Mark official

Tasks Edit Add Remove

Datasets Edit

Results from the Paper Edit Add Remove

Methods Edit Add Remove

Code

Add Remove Mark official

Tasks

Add Remove

Datasets

Results from the Paper

Add Remove

Methods

Add Remove