TëXtmarkers at SemEval-2020 Task 10: Emphasis Selection with Agreement Dependent Crowd Layers
In visual communication, the ability of a short piece of text to catch someone's eye in a single glance or from a distance is of paramount importance. In our approach to the SemEval-2020 task "Emphasis Selection For Written Text in Visual Media", we use contextualized word representations from a pretrained BERT model together with a stacked bidirectional GRU network to predict token-level emphasis probabilities. To tackle the low inter-annotator agreement in the dataset, we attempt to model multiple annotators jointly by introducing initialization with agreement-dependent noise to a crowd layer architecture. We found that this approach performs substantially better than initialization with identities for this purpose and also outperforms a baseline trained with token-level majority voting. Our submission system reaches a substantially higher Match_m on the development set than the task baseline (0.779), but only slightly outperforms the test set baseline (0.754) using a three-model ensemble.
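To make the contrast between token-level majority voting and a crowd layer concrete, the following is a minimal pure-Python sketch. It is an illustration only and not the authors' implementation: the function names, the 2x2 per-annotator matrix, the `noise_scale` parameter, and the rule that noise grows linearly as agreement drops are all assumptions made for this example, since the abstract does not specify them.

```python
import math
import random


def token_agreement(votes):
    """Fraction of annotators who side with the majority label for one token."""
    ones = sum(votes)
    return max(ones, len(votes) - ones) / len(votes)


def majority_label(votes):
    """Token-level majority vote; ties resolve to 0 (an arbitrary choice)."""
    return 1 if 2 * sum(votes) > len(votes) else 0


def init_crowd_matrix(agreement, noise_scale=0.5, rng=None):
    """Hypothetical initialization: identity plus noise scaled by disagreement.

    `agreement` is expected in [0.5, 1.0]; full agreement (1.0) gives the
    plain identity matrix, i.e. the identity-initialized crowd layer.
    """
    rng = rng or random.Random(0)
    strength = noise_scale * (1.0 - agreement)
    return [
        [(1.0 if i == j else 0.0) + strength * rng.uniform(-1.0, 1.0)
         for j in range(2)]
        for i in range(2)
    ]


def crowd_layer(probs, matrix):
    """Map shared class probabilities to one annotator's distribution.

    Computes softmax(W p) for a 2-class (not emphasized, emphasized) output.
    """
    logits = [sum(matrix[i][j] * probs[j] for j in range(2)) for i in range(2)]
    top = max(logits)
    exps = [math.exp(value - top) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Example: four annotators label one token; three mark it as emphasized.
votes = [1, 1, 1, 0]
agreement = token_agreement(votes)      # 3 of 4 annotators agree
matrix = init_crowd_matrix(agreement)   # identity plus mild noise
annotator_probs = crowd_layer([0.2, 0.8], matrix)
```

In a trained system, the shared probabilities would come from the BERT and GRU encoder, and each annotator's matrix would be learned jointly with it. The sketch only shows how such a matrix could be initialized and applied.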