EvalD Reference-Less Discourse Evaluation for WMT18

We present the results of automatic evaluation of discourse in machine translation (MT) outputs using the EVALD tool. EVALD was originally designed and trained to assess the quality of human writing, both by native speakers and by foreign-language learners. MT has seen a tremendous leap in translation quality at the sentence level, so it is interesting to see whether this kind of human-level evaluation is becoming relevant for MT output as well.
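The paper does not spell out EVALD's feature set or models here, but the core idea of reference-less discourse evaluation can be illustrated with a minimal sketch: score a whole document by surface discourse features (connective density, pronoun density, sentence length) without comparing it against any reference translation. Everything below — the feature choices, word lists, and function names — is a hypothetical illustration, not EVALD's actual implementation.

```python
import re

# Hypothetical sketch of reference-less discourse scoring.
# These word lists and features are illustrative assumptions,
# not EVALD's real feature set.
CONNECTIVES = {"however", "therefore", "moreover", "because", "although", "thus"}
PRONOUNS = {"he", "she", "it", "they", "this", "that", "these", "those"}

def discourse_features(text: str) -> dict:
    """Compute simple surface discourse features for a whole document,
    with no reference translation involved."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s]
    tokens = [t.lower() for t in re.findall(r"[A-Za-z']+", text)]
    n_tokens = max(len(tokens), 1)
    lengths = [len(re.findall(r"[A-Za-z']+", s)) for s in sentences]
    mean_len = sum(lengths) / max(len(lengths), 1)
    return {
        "connective_density": sum(t in CONNECTIVES for t in tokens) / n_tokens,
        "pronoun_density": sum(t in PRONOUNS for t in tokens) / n_tokens,
        "mean_sentence_length": mean_len,
    }

if __name__ == "__main__":
    doc = ("The system translated the report. However, it omitted one section. "
           "Therefore, the summary was incomplete.")
    print(discourse_features(doc))
```

In a trained evaluator such as EVALD, features of this kind would feed a classifier fitted on graded human writing; the sketch stops at feature extraction, which is the part that makes the approach reference-less.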
