SeqL at SemEval-2022 Task 11: An Ensemble of Transformer Based Models for Complex Named Entity Recognition Task

This paper presents the system we used to participate in Task 11 (MultiCoNER) of the SemEval-2022 competition. Our system ranked fourth in Track 12 (Multilingual) and fifth in Track 13 (Code-Mixed). The goal of Track 12 is to detect complex named entities in a multilingual setting, while Track 13 targets the same task in a code-mixed setting. Both submissions were built on transformer-based language models. We used an ensemble of XLM-RoBERTa-large and microsoft/infoxlm-large with a Conditional Random Field (CRF) layer. In addition, we describe the algorithms employed to train our models and our hyper-parameter selection. We furthermore study the impact of different methods for aggregating the outputs of the individual models that compose our ensemble. Finally, we present an extensive analysis of the results and errors.
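The abstract does not specify which aggregation strategy performed best, and no code is attached to this page. As a rough illustration of one plausible strategy for combining the outputs of the individual taggers, the sketch below aggregates aligned per-token label sequences by majority vote; the function name, label set, and example predictions are hypothetical and are not taken from the paper.

    from collections import Counter
    from typing import List

    def majority_vote(label_sequences: List[List[str]]) -> List[str]:
        """Aggregate aligned per-token NER labels from several models by majority vote.

        Ties are broken in favour of the label predicted by the first model,
        so that model effectively acts as the tie-breaker.
        """
        aggregated = []
        for labels in zip(*label_sequences):
            counts = Counter(labels)
            top = max(counts.values())
            winner = next(lbl for lbl in labels if counts[lbl] == top)
            aggregated.append(winner)
        return aggregated

    # Hypothetical predictions from two token classifiers for the same four tokens.
    xlmr_labels    = ["B-CW", "I-CW", "O", "B-PER"]
    infoxlm_labels = ["B-CW", "O",    "O", "B-PER"]
    print(majority_vote([xlmr_labels, infoxlm_labels]))
    # -> ['B-CW', 'I-CW', 'O', 'B-PER']  (the tie on the second token falls back to the first model)

Other natural choices, such as averaging the models' emission scores before CRF decoding, fit the same setting; the paper compares several aggregation methods, and this sketch does not assume which one it adopted.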


Datasets

MultiCoNER

Methods

CRF, Ensemble, InfoXLM, XLM-RoBERTa