no code implementations • EACL (BSNLP) 2021 • Jakub Piskorski, Bogdan Babych, Zara Kancheva, Olga Kanishcheva, Maria Lebedeva, Michał Marcińczuk, Preslav Nakov, Petya Osenova, Lidia Pivovarova, Senja Pollak, Pavel Přibáň, Ivaylo Radev, Marko Robnik-Sikonja, Vasyl Starko, Josef Steinberger, Roman Yangarber
Seven teams covered all six languages, and five teams participated in the cross-lingual entity linking task.
no code implementations • ACL (CASE) 2021 • Jacek Haneczok, Guillaume Jacquet, Jakub Piskorski, Nicolas Stefanovitch
This paper describes the Shared Task on Fine-grained Event Classification in News-like Text Snippets.
no code implementations • LREC 2022 • Nicolas Stefanovitch, Jakub Piskorski, Sopho Kharazi
The results of various experiments on the performance of both lexicon- and machine learning-based models for Georgian sentiment classification are also reported.
no code implementations • EACL (Hackashop) 2021 • Jakub Piskorski, Nicolas Stefanovitch, Guillaume Jacquet, Aldo Podavini
This paper presents a study of state-of-the-art unsupervised and linguistically unsophisticated keyword extraction algorithms covering statistical, graph-based, and embedding-based approaches, including, among others, Total Keyword Frequency, TF-IDF, RAKE, KPMiner, YAKE, KeyBERT, and variants of TextRank-based keyword extraction algorithms.
no code implementations • 30 Mar 2024 • Jakub Piskorski, Michał Marcińczuk, Roman Yangarber
The corpus consists of 5 017 documents on seven topics.
no code implementations • ACL (CASE) 2021 • Ali Hürriyetoğlu, Hristo Tanev, Vanni Zavarella, Jakub Piskorski, Reyyan Yeniterzi, Erdem Yörük
This workshop is the fourth issue of a series of workshops on automatic extraction of socio-political events from news, organized by the Emerging Market Welfare Project, with the support of the Joint Research Centre of the European Commission and with contributions from many other prominent scholars in this field.
no code implementations • COLING 2020 • Jakub Piskorski, Jacek Haneczok, Guillaume Jacquet
We introduce a new set of benchmark datasets derived from ACLED data for fine-grained event classification and compare the performance of various state-of-the-art models on these datasets, including SVM based on TF-IDF character n-grams and neural context-free embeddings (GloVe and fastText) as well as deep learning-based BERT with its contextual embeddings.
no code implementations • LREC 2020 • Jakub Piskorski, Guillaume Jacquet
Automating the detection of event mentions in online texts and their classification vis-a-vis domain-specific event type taxonomies has been acknowledged by many organisations worldwide to be of paramount importance in order to facilitate the process of intelligence gathering.
no code implementations • WS 2019 • Guillaume Jacquet, Jakub Piskorski, Hristo Tanev, Ralf Steinberger
We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, lemmatisation and cross-lingual linking.
no code implementations • WS 2019 • Jakub Piskorski, Laska Laskova, Michał Marcińczuk, Lidia Pivovarova, Pavel Přibáň, Josef Steinberger, Roman Yangarber
The task is recognizing mentions of named entities in Web documents, their normalization, and cross-lingual linking.
no code implementations • COLING 2018 • Jakub Piskorski, Fredi Šarić, Vanni Zavarella, Martin Atkinson
The paper reports on exploring various machine learning techniques and a range of textual and meta-data features to train classifiers for linking related event templates automatically extracted from online news.
no code implementations • WS 2017 • Martin Atkinson, Jakub Piskorski, Hristo Tanev, Vanni Zavarella
This paper reports on an effort to create a corpus of structured information on security-related events automatically extracted from online news, part of which has been manually curated.
no code implementations • WS 2017 • Jakub Piskorski, Lidia Pivovarova, Jan Šnajder, Josef Steinberger, Roman Yangarber
The reported evaluation figures reflect the relatively higher level of complexity of named entity-related tasks in the context of processing texts in Slavic languages.
no code implementations • WS 2017 • Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski
This paper describes an approach for the classification of millions of existing multi-word entities (MWEntities), such as organisation or event names, into thirteen category types, based only on the tokens they contain.