no code implementations • RANLP 2021 • Maria Kunilovskaya, Alistair Plum
This paper focuses on data cleaning as part of a preprocessing procedure applied to text data retrieved from the web.
no code implementations • 25 Mar 2024 • Alistair Plum, Tharindu Ranasinghe, Christoph Purschke
We also create a manually annotated dataset with 2000 instances to evaluate the models and release it together with the dataset compiled using guided distant supervision.
no code implementations • 2 May 2022 • Alistair Plum, Tharindu Ranasinghe, Spencer Jones, Constantin Orasan, Ruslan Mitkov
The dataset, which is aimed towards digital humanities (DH) and historical research, is automatically compiled by aligning sentences from Wikipedia articles with matching structured data from sources including Pantheon and Wikidata.
no code implementations • SEMEVAL 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
no code implementations • 13 Oct 2020 • Tharindu Ranasinghe, Alistair Plum, Constantin Orasan, Ruslan Mitkov
This paper presents the RGCL team submission to SemEval 2020 Task 6: DeftEval, subtasks 1 and 2.
no code implementations • RANLP 2019 • Alistair Plum, Tharindu Ranasinghe, Constantin Orasan
This paper compares how different machine learning classifiers can be used together with simple string matching and named entity recognition to detect locations in texts.
no code implementations • SEMEVAL 2019 • Alistair Plum, Tharindu Ranasinghe, Pablo Calleja, Constantin Orăsan, Ruslan Mitkov
This article describes the system submitted by the RGCL-WLV team to the SemEval 2019 Task 12: Toponym resolution in scientific papers.