Search Results for author: Jakub Piskorski

Found 16 papers, 0 papers with code

Resources and Experiments on Sentiment Classification for Georgian

no code implementations LREC 2022 Nicolas Stefanovitch, Jakub Piskorski, Sopho Kharazi

The results of various experiments on the performance of both lexicon- and machine learning-based models for Georgian sentiment classification are also reported.

Classification Sentiment Analysis +3

Exploring Linguistically-Lightweight Keyword Extraction Techniques for Indexing News Articles in a Multilingual Set-up

no code implementations EACL (Hackashop) 2021 Jakub Piskorski, Nicolas Stefanovitch, Guillaume Jacquet, Aldo Podavini

This paper presents a study of state-of-the-art unsupervised and linguistically unsophisticated keyword extraction algorithms, based on statistic-, graph-, and embedding-based approaches, including, i.a., Total Keyword Frequency, TF-IDF, RAKE, KPMiner, YAKE, KeyBERT, and variants of TextRank-based keyword extraction algorithms.

Keyword Extraction

Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021): Workshop and Shared Task Report

no code implementations ACL (CASE) 2021 Ali Hürriyetoğlu, Hristo Tanev, Vanni Zavarella, Jakub Piskorski, Reyyan Yeniterzi, Erdem Yörük

This workshop is the fourth issue of a series of workshops on automatic extraction of socio-political events from news, organized by the Emerging Market Welfare Project, with the support of the Joint Research Centre of the European Commission and with contributions from many other prominent scholars in this field.

Bias Detection Learning Word Embeddings +3

New Benchmark Corpus and Models for Fine-grained Event Classification: To BERT or not to BERT?

no code implementations COLING 2020 Jakub Piskorski, Jacek Haneczok, Guillaume Jacquet

We introduce a new set of benchmark datasets derived from ACLED data for fine-grained event classification and compare the performance of various state-of-the-art models on these datasets, including SVM based on TF-IDF character n-grams and neural context-free embeddings (GLOVE and FASTTEXT) as well as deep learning-based BERT with its contextual embeddings.

Classification

TF-IDF Character N-grams versus Word Embedding-based Models for Fine-grained Event Classification: A Preliminary Study

no code implementations LREC 2020 Jakub Piskorski, Guillaume Jacquet

Automating the detection of event mentions in online texts and their classification vis-a-vis domain-specific event type taxonomies has been acknowledged by many organisations worldwide to be of paramount importance in order to facilitate the process of intelligence gathering.

General Classification Word Embeddings

JRC TMA-CC: Slavic Named Entity Recognition and Linking. Participation in the BSNLP-2019 shared task

no code implementations WS 2019 Guillaume Jacquet, Jakub Piskorski, Hristo Tanev, Ralf Steinberger

We report on the participation of the JRC Text Mining and Analysis Competence Centre (TMA-CC) in the BSNLP-2019 Shared Task, which focuses on named-entity recognition, lemmatisation and cross-lingual linking.

Named Entity Recognition +1

On Training Classifiers for Linking Event Templates

no code implementations COLING 2018 Jakub Piskorski, Fredi Šarić, Vanni Zavarella, Martin Atkinson

The paper reports on exploring various machine learning techniques and a range of textual and meta-data features to train classifiers for linking related event templates automatically extracted from online news.

BIG-bench Machine Learning

On the Creation of a Security-Related Event Corpus

no code implementations WS 2017 Martin Atkinson, Jakub Piskorski, Hristo Tanev, Vanni Zavarella

This paper reports on an effort of creating a corpus of structured information on security-related events automatically extracted from on-line news, part of which has been manually curated.

Event Extraction

Multi-word Entity Classification in a Highly Multilingual Environment

no code implementations WS 2017 Sophie Chesney, Guillaume Jacquet, Ralf Steinberger, Jakub Piskorski

This paper describes an approach for the classification of millions of existing multi-word entities (MWEntities), such as organisation or event names, into thirteen category types, based only on the tokens they contain.

Classification General Classification +1
