Search Results for author: John Philip McCrae

Found 25 papers, 3 papers with code

A Dataset for Term Extraction in Hindi

no code implementations • TERM (LREC) 2022 • Shubhanker Banerjee, Bharathi Raja Chakravarthi, John Philip McCrae

Automatic Term Extraction (ATE) is one of the core problems in natural language processing and forms a key component of text mining pipelines for domain-specific corpora.

Machine Translation Term Extraction

Adaptation of Word-Level Benchmark Datasets for Relation-Level Metaphor Identification

no code implementations • WS 2020 • Omnia Zayed, John Philip McCrae, Paul Buitelaar

The majority of current approaches pertaining to metaphor processing concentrate on word-level processing due to data availability.

Relation

NUIG at TIAD: Combining Unsupervised NLP and Graph Metrics for Translation Inference

no code implementations • LREC 2020 • John Philip McCrae, Mihael Arcan

In this paper, we present the NUIG system at the TIAD shared task.

Document Embedding Machine Translation +1

Recent Developments for the Linguistic Linked Open Data Infrastructure

no code implementations • LREC 2020 • Thierry Declerck, John Philip McCrae, Matthias Hartung, Jorge Gracia, Christian Chiarcos, Elena Montiel-Ponsoda, Philipp Cimiano, Artem Revenko, Roser Saurí, Deirdre Lee, Stefania Racioppa, Jamal Abdul Nasir, Matthias Orlikowski, Marta Lanau-Coronas, Christian Fäth, Mariano Rico, Mohammad Fazleh Elahi, Maria Khvalchik, Meritxell Gonzalez, Katharine Cooney

In this paper we describe the contributions made by the European H2020 project "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') to the further development of the Linguistic Linked Open Data (LLOD) infrastructure.

Modelling Frequency and Attestations for OntoLex-Lemon

no code implementations • LREC 2020 • Christian Chiarcos, Maxim Ionov, Jesse de Does, Katrien Depuydt, Anas Fahad Khan, Sander Stolk, Thierry Declerck, John Philip McCrae

Therefore, the OntoLex community has put forward the proposal for a novel module for frequency, attestation and corpus information (FrAC), that not only covers the requirements of digital lexicography, but also accommodates essential data structures for lexical information in natural language processing.

Challenges of Word Sense Alignment: Portuguese Language Resources

no code implementations • LREC 2020 • Ana Salgado, Sina Ahmadi, Alberto Simões, John Philip McCrae, Rute Costa

Word sense alignment involves searching for matching senses within dictionary entries of different lexical resources and linking them, which poses significant challenges.

A Comparative Study of Different State-of-the-Art Hate Speech Detection Methods in Hindi-English Code-Mixed Data

no code implementations • LREC 2020 • Priya Rani, Shardul Suryawanshi, Koustava Goswami, Bharathi Raja Chakravarthi, Theodorus Fransen, John Philip McCrae

Hate speech detection in social media communication has become one of the primary concerns to avoid conflicts and curb undesired activities.

Hate Speech Detection

A Dataset for Troll Classification of TamilMemes

no code implementations • LREC 2020 • Shardul Suryawanshi, Bharathi Raja Chakravarthi, Pranav Verma, Mihael Arcan, John Philip McCrae, Paul Buitelaar

Social media are interactive platforms that facilitate the creation or sharing of information, ideas or other forms of expression among people.

Classification General Classification +1

Some Issues with Building a Multilingual Wordnet

no code implementations • LREC 2020 • Francis Bond, Luis Morgado da Costa, Michael Wayne Goodman, John Philip McCrae, Ahti Lohk

In this paper we discuss the experience of bringing together over 40 different wordnets.

Figure Me Out: A Gold Standard Dataset for Metaphor Interpretation

no code implementations • LREC 2020 • Omnia Zayed, John Philip McCrae, Paul Buitelaar

Metaphor comprehension and understanding is a complex cognitive task that requires interpreting metaphors by grasping the interaction between the meaning of their target and source concepts.

Retrieval Semantic Similarity +2

A Multilingual Evaluation Dataset for Monolingual Word Sense Alignment

1 code implementation • LREC 2020 • Sina Ahmadi, John Philip McCrae, Sanni Nimb, Fahad Khan, Monica Monachini, Bolette Pedersen, Thierry Declerck, Tanja Wissik, Andrea Bellandi, Irene Pisani, Thomas Troelsgård, Sussi Olsen, Simon Krek, Veronika Lipp, Tamás Váradi, László Simon, András Gyorffy, Carole Tiberius, Tanneke Schoonheim, Yifat Ben Moshe, Maya Rudich, Raya Abu Ahmad, Dorielle Lonke, Kira Kovalenko, Margit Langemets, Jelena Kallas, Oksana Dereza, Theodorus Fransen, David Cillessen, David Lindemann, Mikel Alonso, Ana Salgado, José Luis Sancho, Rafael-J. Ureña-Ruiz, Jordi Porta Zamorano, Kiril Simov, Petya Osenova, Zara Kancheva, Ivaylo Radev, Ranka Stanković, Andrej Perdih, Dejan Gabrovsek

Aligning senses across resources and languages is a challenging task with beneficial applications in the field of natural language processing and electronic lexicography.

On the Linguistic Linked Open Data Infrastructure

no code implementations • LREC 2020 • Christian Chiarcos, Bettina Klimek, Christian Fäth, Thierry Declerck, John Philip McCrae

In this paper we describe the current state of development of the Linguistic Linked Open Data (LLOD) infrastructure, an LOD (sub-)cloud of linguistic resources, which covers various linguistic databases, lexicons, corpora, terminology and metadata repositories. We give a detailed overview of the contributions made by the European H2020 projects "Prêt-à-LLOD" ('Ready-to-use Multilingual Linked Language Data for Knowledge Services across Sectors') and "ELEXIS" ('European Lexicographic Infrastructure') to the further development of the LLOD.

English WordNet 2020: Improving and Extending a WordNet for English using an Open-Source Methodology

no code implementations • LREC 2020 • John Philip McCrae, Alexandre Rademaker, Ewa Rudnicka, Francis Bond

WordNet, while one of the most widely used resources for NLP, has not been updated for a long time, and as such a new project English WordNet has arisen to continue the development of the model under an open-source paradigm.

Identification of Adjective-Noun Neologisms using Pretrained Language Models

1 code implementation • WS 2019 • John Philip McCrae

Neologism detection is a key task in the construction of lexical resources and has wider implications for NLP; however, the identification of multiword neologisms has received little attention.

Word Embeddings

Phrase-Level Metaphor Identification Using Distributed Representations of Word Meaning

no code implementations • WS 2018 • Omnia Zayed, John Philip McCrae, Paul Buitelaar

Metaphor is an essential element of human cognition which is often used to express ideas and emotions that might be difficult to express using literal language.

Machine Translation Semantic Textual Similarity +3

Expanding wordnets to new languages with multilingual sense disambiguation

no code implementations • COLING 2016 • Mihael Arcan, John Philip McCrae, Paul Buitelaar

The translation of wordnets is fundamentally complex because of the need to translate all senses of a word including low frequency senses, which is very challenging for current machine translation approaches.

Information Retrieval Machine Translation +3

NUIG-UNLP at SemEval-2016 Task 1: Soft Alignment and Deep Learning for Semantic Textual Similarity

no code implementations • SEMEVAL 2016 • John Philip McCrae, Kartik Asooja, Nitish Aggarwal, Paul Buitelaar

Machine Translation Semantic Textual Similarity +1

The Open Linguistics Working Group: Developing the Linguistic Linked Open Data Cloud

no code implementations • LREC 2016 • John Philip McCrae, Christian Chiarcos, Francis Bond, Philipp Cimiano, Thierry Declerck, Gerard de Melo, Jorge Gracia, Sebastian Hellmann, Bettina Klimek, Steven Moran, Petya Osenova, Antonio Pareja-Lora, Jonathan Pool

The Open Linguistics Working Group (OWLG) brings together researchers from various fields of linguistics, natural language processing, and information technology to present and discuss principles, case studies, and best practices for representing, publishing and linking linguistic data collections.