What kinds of errors do reference resolution models make and what can we learn from them?

Findings (NAACL) 2022 · Jorge Sánchez, Mauricio Mazuecos, Hernán Maina, Luciana Benotti ·

Referring resolution is the task of identifying the referent of a natural language expression, for example “the woman behind the other woman getting a massage”. In this paper we investigate which are the kinds of referring expressions on which current transformer based models fail. Motivated by this analysis we identify the weakening of the spatial natural constraints as one of its causes and propose a model that aims to restore it. We evaluate our proposed model on different datasets for the task showing improved performance on the most challenging kinds of referring expressions. Finally we present a thorough analysis of the kinds errors that are improved by the new model and those that are not and remain future challenges for the task.

PDF Abstract