no code implementations • ACL (CASE) 2021 • Jacek Haneczok, Guillaume Jacquet, Jakub Piskorski, Nicolas Stefanovitch
This paper describes the Shared Task on Fine-grained Event Classification in News-like Text Snippets.
no code implementations • EACL (Hackashop) 2021 • Jakub Piskorski, Nicolas Stefanovitch, Guillaume Jacquet, Aldo Podavini
This paper presents a study of state-of-the-art unsupervised and linguistically unsophisticated keyword extraction algorithms, based on statistics-, graph-, and embedding-based approaches, including, inter alia, Total Keyword Frequency, TF-IDF, RAKE, KPMiner, YAKE, KeyBERT, and variants of TextRank-based keyword extraction algorithms.
no code implementations • 14 Nov 2022 • Francisco Casacuberta, Alexandru Ceausu, Khalid Choukri, Miltos Deligiannis, Miguel Domingo, Mercedes García-Martínez, Manuel Herranz, Guillaume Jacquet, Vassilis Papavassiliou, Stelios Piperidis, Prokopis Prokopidis, Dimitris Roussis, Marwa Hadj Salah
This work presents the results of the machine translation (MT) task from the Covid-19 MLIA @ Eval initiative, a community effort to improve the generation of MT systems focused on the current Covid-19 crisis.
no code implementations • COLING 2020 • Jakub Piskorski, Jacek Haneczok, Guillaume Jacquet
We introduce a new set of benchmark datasets derived from ACLED data for fine-grained event classification and compare the performance of various state-of-the-art models on these datasets, including SVM based on TF-IDF character n-grams and neural context-free embeddings (GLOVE and FASTTEXT) as well as deep learning-based BERT with its contextual embeddings.
no code implementations • LREC 2020 • Jakub Piskorski, Guillaume Jacquet
Automating the detection of event mentions in online texts and their classification vis-a-vis domain-specific event type taxonomies has been acknowledged by many organisations worldwide to be of paramount importance in order to facilitate the process of intelligence gathering.
no code implementations • WS 2019 • Guillaume Jacquet, Jakub Piskorski, Hristo Tanev, Ralf Steinberger
We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, lemmatisation and cross-lingual linking.
no code implementations • WS 2017 • Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski
This paper describes an approach for the classification of millions of existing multi-word entities (MWEntities), such as organisation or event names, into thirteen category types, based only on the tokens they contain.
no code implementations • 18 Jul 2016 • Matthias Galle, Jean-Michel Renders, Guillaume Jacquet
Clustering web documents has numerous applications, such as aggregating news articles into meaningful events, detecting trends and hot topics on the Web, preserving diversity in search results, etc.
no code implementations • LREC 2016 • Guillaume Jacquet, Maud Ehrmann, Ralf Steinberger, Jaakko Väyrynen
This paper reports on an approach and experiments to automatically build a cross-lingual multi-word entity resource.
no code implementations • 8 Mar 2016 • Ralf Steinberger, Aldo Podavini, Alexandra Balahur, Guillaume Jacquet, Hristo Tanev, Jens Linge, Martin Atkinson, Michele Chinosi, Vanni Zavarella, Yaniv Steiner, Erik van der Goot
Any large organisation, be it public or private, monitors the media for information to keep abreast of developments in their field of interest, and usually also to become aware of positive or negative opinions expressed towards them.
no code implementations • LREC 2014 • Guillaume Jacquet, Maud Ehrmann, Ralf Steinberger
Multi-word entities, such as organisation names, are frequently written in many different ways.
no code implementations • LREC 2014 • Dilek Küçük, Guillaume Jacquet, Ralf Steinberger
Various recent studies show that the performance of named entity recognition (NER) systems developed for well-formed text types drops significantly when applied to tweets.
no code implementations • LREC 2014 • Alexandra Balahur, Marco Turchi, Ralf Steinberger, Jose-Manuel Perea-Ortega, Guillaume Jacquet, Dilek Küçük, Vanni Zavarella, Adil El Ghali
We show that using machine-translated data yields results similar to those obtained with native-speaker translations of the same data.