Search Results for author: Ivan Vuli{\'c}

Found 54 papers, 6 papers with code

From Zero to Hero: On the Limitations of Zero-Shot Language Transfer with Multilingual Transformers

no code implementations • EMNLP 2020 • Anne Lauscher, Vinit Ravishankar, Ivan Vuli{\'c}, Goran Glava{\v{s}}

Massively multilingual transformers (MMTs) pretrained via language modeling (e. g., mBERT, XLM-R) have become a default paradigm for zero-shot language transfer in NLP, offering unmatched transfer performance.

Cross-Lingual Word Embeddings Dependency Parsing +5

Paper
Add Code

LexFit: Lexical Fine-Tuning of Pretrained Language Models

no code implementations • ACL 2021 • Ivan Vuli{\'c}, Edoardo Maria Ponti, Anna Korhonen, Goran Glava{\v{s}}

Inspired by prior work on semantic specialization of static word embedding (WE) models, we show that it is possible to expose and enrich lexical knowledge from the LMs, that is, to specialize them to serve as effective and universal {``}decontextualized{''} word encoders even when fed input words {``}in isolation{''} (i. e., without any context).

Cross-Lingual Transfer

Paper
Add Code

Is Supervised Syntactic Parsing Beneficial for Language Understanding Tasks? An Empirical Investigation

no code implementations • EACL 2021 • Goran Glava{\v{s}}, Ivan Vuli{\'c}

Traditional NLP has long held (supervised) syntactic parsing necessary for successful higher-level semantic language understanding (LU).

Language Modelling Natural Language Understanding

Paper
Add Code

Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis

1 code implementation • COLING 2020 • Olga Majewska, Ivan Vuli{\'c}, Diana McCarthy, Anna Korhonen

We present the first evaluation of the applicability of a spatial arrangement method (SpAM) to a typologically diverse language sample, and its potential to produce semantic evaluation resources to support multilingual NLP, with a focus on verb semantics.

Clustering Multilingual NLP +1

Paper
Code

SemEval-2020 Task 3: Graded Word Similarity in Context

no code implementations • SEMEVAL 2020 • Carlos Santos Armendariz, Matthew Purver, Senja Pollak, Nikola Ljube{\v{s}}i{\'c}, Matej Ul{\v{c}}ar, Ivan Vuli{\'c}, Mohammad Taher Pilehvar

This paper presents the Graded Word Similarity in Context (GWSC) task which asked participants to predict the effects of context on human perception of similarity in English, Croatian, Slovene and Finnish.

Translation Word Similarity

Paper
Add Code

SemEval-2020 Task 2: Predicting Multilingual and Cross-Lingual (Graded) Lexical Entailment

no code implementations • SEMEVAL 2020 • Goran Glava{\v{s}}, Ivan Vuli{\'c}, Anna Korhonen, Simone Paolo Ponzetto

The shared task spans three dimensions: (1) monolingual vs. cross-lingual LE, (2) binary vs. graded LE, and (3) a set of 6 diverse languages (and 15 corresponding language pairs).

Lexical Entailment Natural Language Inference +1

Paper
Add Code

XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages

no code implementations • COLING 2020 • Goran Glava{\v{s}}, Mladen Karan, Ivan Vuli{\'c}

We present XHate-999, a multi-domain and multilingual evaluation data set for abusive language detection.

Abusive Language Disentanglement +2

Paper
Add Code

Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction

no code implementations • ACL 2020 • Mladen Karan, Ivan Vuli{\'c}, Anna Korhonen, Goran Glava{\v{s}}

Effective projection-based cross-lingual word embedding (CLWE) induction critically relies on the iterative self-learning procedure.

Bilingual Lexicon Induction Classification +3

Paper
Add Code

Improving Bilingual Lexicon Induction with Unsupervised Post-Processing of Monolingual Word Vector Spaces

no code implementations • WS 2020 • Ivan Vuli{\'c}, Anna Korhonen, Goran Glava{\v{s}}

Work on projection-based induction of cross-lingual word embedding spaces (CLWEs) predominantly focuses on the improvement of the projection (i. e., mapping) mechanisms.

Bilingual Lexicon Induction

Paper
Add Code

Non-Linear Instance-Based Cross-Lingual Mapping for Non-Isomorphic Embedding Spaces

no code implementations • ACL 2020 • Goran Glava{\v{s}}, Ivan Vuli{\'c}

We present InstaMap, an instance-based method for learning projection-based cross-lingual word embeddings.

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +2

Paper
Add Code

Spatial Multi-Arrangement for Clustering and Multi-way Similarity Dataset Construction

no code implementations • LREC 2020 • Olga Majewska, Diana McCarthy, Jasper van den Bosch, Nikolaus Kriegeskorte, Ivan Vuli{\'c}, Anna Korhonen

We present a novel methodology for fast bottom-up creation of large-scale semantic similarity resources to support development and evaluation of NLP systems.

Clustering Semantic Similarity +2

Paper
Add Code

Cross-lingual Semantic Specialization via Lexical Relation Induction

no code implementations • IJCNLP 2019 • Edoardo Maria Ponti, Ivan Vuli{\'c}, Goran Glava{\v{s}}, Roi Reichart, Anna Korhonen

Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints.

dialog state tracking Lexical Simplification +4

Paper
Add Code

Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation

no code implementations • CONLL 2019 • Qianchu Liu, Diana McCarthy, Ivan Vuli{\'c}, Anna Korhonen

In this paper, we present a thorough investigation on methods that align pre-trained contextualized embeddings into shared cross-lingual context-aware embedding space, providing strong reference benchmarks for future context-aware crosslingual models.

Retrieval Sentence +1

Paper
Add Code

Hello, It's GPT-2 - How Can I Help You? Towards the Use of Pretrained Language Models for Task-Oriented Dialogue Systems

no code implementations • WS 2019 • Pawe{\l} Budzianowski, Ivan Vuli{\'c}

Data scarcity is a long-standing and crucial challenge that hinders quick development of task-oriented dialogue systems across multiple domains: task-oriented dialogue models are expected to learn grammar, syntax, dialogue reasoning, decision making, and language generation from absurdly small amounts of task-specific data.

Decision Making Language Modelling +3

Paper
Add Code

Specializing Distributional Vectors of All Words for Lexical Entailment

no code implementations • WS 2019 • Aishwarya Kamath, Jonas Pfeiffer, Edoardo Maria Ponti, Goran Glava{\v{s}}, Ivan Vuli{\'c}

Semantic specialization methods fine-tune distributional word vectors using lexical knowledge from external resources (e. g. WordNet) to accentuate a particular relation between words.

Cross-Lingual Transfer Lexical Entailment +3

Paper
Add Code

JW300: A Wide-Coverage Parallel Corpus for Low-Resource Languages

no code implementations • ACL 2019 • {\v{Z}}eljko Agi{\'c}, Ivan Vuli{\'c}

Viable cross-lingual transfer critically depends on the availability of parallel texts.

Cross-Lingual Transfer

Paper
Add Code

Unsupervised Cross-Lingual Representation Learning

no code implementations • ACL 2019 • Sebastian Ruder, Anders S{\o}gaard, Ivan Vuli{\'c}

In this tutorial, we provide a comprehensive survey of the exciting recent work on cutting-edge weakly-supervised and unsupervised cross-lingual word representations.

Representation Learning Structured Prediction

Paper
Add Code

Multilingual and Cross-Lingual Graded Lexical Entailment

no code implementations • ACL 2019 • Ivan Vuli{\'c}, Simone Paolo Ponzetto, Goran Glava{\v{s}}

Starting from HyperLex, the only available GR-LE dataset in English, we construct new monolingual GR-LE datasets for three other languages, and combine those to create a set of six cross-lingual GR-LE datasets termed CL-HYPERLEX.

Lexical Entailment

Paper
Add Code

Generalized Tuning of Distributional Word Vectors for Monolingual and Cross-Lingual Lexical Entailment

1 code implementation • ACL 2019 • Goran Glava{\v{s}}, Ivan Vuli{\'c}

Lexical entailment (LE; also known as hyponymy-hypernymy or is-a relation) is a core asymmetric lexical relation that supports tasks like taxonomy induction and text generation.

Lexical Entailment Relation +1

Paper
Code

Show Some Love to Your n-grams: A Bit of Progress and Stronger n-gram Language Modeling Baselines

no code implementations • NAACL 2019 • Ehsan Shareghi, Daniela Gerz, Ivan Vuli{\'c}, Anna Korhonen

In recent years neural language models (LMs) have set the state-of-the-art performance for several benchmarking datasets.

Benchmarking Language Modelling

Paper
Add Code

Learning Unsupervised Multilingual Word Embeddings with Incremental Multilingual Hubs

no code implementations • NAACL 2019 • Geert Heyman, Bregt Verreet, Ivan Vuli{\'c}, Marie-Francine Moens

We learn a shared multilingual embedding space for a variable number of languages by incrementally adding new languages one by one to the current multilingual space.

Bilingual Lexicon Induction Cross-Lingual Word Embeddings +4

Paper
Add Code

On the Relation between Linguistic Typology and (Limitations of) Multilingual Language Modeling

no code implementations • EMNLP 2018 • Daniela Gerz, Ivan Vuli{\'c}, Edoardo Maria Ponti, Roi Reichart, Anna Korhonen

A key challenge in cross-lingual NLP is developing general language-independent architectures that are equally applicable to any language.

Language Modelling Relation

Paper
Add Code

Explicit Retrofitting of Distributional Word Vectors

no code implementations • ACL 2018 • Goran Glava{\v{s}}, Ivan Vuli{\'c}

The ER model allows us to learn a global specialization function and specialize the vectors of words unobserved in the training data as well.

dialog state tracking Lexical Simplification +3

Paper
Add Code

Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP

no code implementations • ACL 2018 • Edoardo Maria Ponti, Roi Reichart, Anna Korhonen, Ivan Vuli{\'c}

The transfer or share of knowledge between languages is a potential solution to resource scarcity in NLP.

Cross-Lingual Transfer Machine Translation +5

Paper
Add Code

Bridging Languages through Images with Deep Partial Canonical Correlation Analysis

1 code implementation • ACL 2018 • Guy Rotman, Ivan Vuli{\'c}, Roi Reichart

We present a deep neural network that leverages images to improve bilingual text embeddings.

Image Retrieval Question Answering +4

Paper
Code

Injecting Lexical Contrast into Word Vectors by Guiding Vector Space Specialisation

no code implementations • WS 2018 • Ivan Vuli{\'c}

Word vector space specialisation models offer a portable, light-weight approach to fine-tuning arbitrary distributional vector spaces to discern between synonymy and antonymy.

Dialogue State Tracking Representation Learning +4

Paper
Add Code

Fully Statistical Neural Belief Tracking

1 code implementation • ACL 2018 • Nikola Mrk{\v{s}}i{\'c}, Ivan Vuli{\'c}

This paper proposes an improvement to the existing data-driven Neural Belief Tracking (NBT) framework for Dialogue State Tracking (DST).

Dialogue Management Dialogue State Tracking +2

167

Paper
Code

Specialising Word Vectors for Lexical Entailment

1 code implementation • NAACL 2018 • Ivan Vuli{\'c}, Nikola Mrk{\v{s}}i{\'c}

We present LEAR (Lexical Entailment Attract-Repel), a novel post-processing method that transforms any input word vector space to emphasise the asymmetric relation of lexical entailment (LE), also known as the IS-A or hyponymy-hypernymy relation.

Ranked #1 on Lexical Entailment on HyperLex

Dialogue State Tracking Lexical Entailment +8

Paper
Code

Deep Learning for Conversational AI

no code implementations • NAACL 2018 • Pei-Hao Su, Nikola Mrk{\v{s}}i{\'c}, I{\~n}igo Casanueva, Ivan Vuli{\'c}

The main purpose of this tutorial is to encourage dialogue research in the NLP community by providing the research background, a survey of available resources, and giving key insights to application of state-of-the-art SDS methodology into industry-scale conversational AI systems.

Decision Making Dialogue Management +5

Paper
Add Code

Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model

1 code implementation • NAACL 2018 • Goran Glava{\v{s}}, Ivan Vuli{\'c}

We present a simple and effective feed-forward neural architecture for discriminating between lexico-semantic relations (synonymy, antonymy, hypernymy, and meronymy).

Natural Language Inference Paraphrase Generation +2

Paper
Code

Acquiring Verb Classes Through Bottom-Up Semantic Verb Clustering

no code implementations • LREC 2018 • Olga Majewska, Diana McCarthy, Ivan Vuli{\'c}, Anna Korhonen

Clustering Semantic Textual Similarity

Paper
Add Code

Language Modeling for Morphologically Rich Languages: Character-Aware Modeling for Word-Level Prediction

no code implementations • TACL 2018 • Daniela Gerz, Ivan Vuli{\'c}, Edoardo Ponti, Jason Naradowsky, Roi Reichart, Anna Korhonen

Neural architectures are prominent in the construction of language models (LMs).

Dialogue Generation Language Modelling +1

Paper
Add Code

Cross-Lingual Word Representations: Induction and Evaluation

no code implementations • EMNLP 2017 • Manaal Faruqui, Anders S{\o}gaard, Ivan Vuli{\'c}

With the increasing use of monolingual word vectors, there is a need for word vectors that can be used as efficiently across multiple languages as monolingually.

Multilingual Word Embeddings

Paper
Add Code

Evaluation by Association: A Systematic Study of Quantitative Word Association Evaluation

no code implementations • EACL 2017 • Ivan Vuli{\'c}, Douwe Kiela, Anna Korhonen

Recent work on evaluating representation learning architectures in NLP has established a need for evaluation protocols based on subconscious cognitive measures rather than manually tailored intrinsic similarity and relatedness tasks.

Information Retrieval Representation Learning +2

Paper
Add Code

Word Vector Space Specialisation

no code implementations • EACL 2017 • Ivan Vuli{\'c}, Nikola Mrk{\v{s}}i{\'c}, Mohammad Taher Pilehvar

Specialising vector spaces to maximise their content with respect to one key property of vector space models (e. g. semantic similarity vs. relatedness or lexical entailment) while mitigating others has become an active and attractive research topic in representation learning.

Lexical Entailment Representation Learning +2

Paper
Add Code

Cross-Lingual Syntactically Informed Distributed Word Representations

no code implementations • EACL 2017 • Ivan Vuli{\'c}

We develop a novel cross-lingual word representation model which injects syntactic information through dependency-based contexts into a shared cross-lingual word vector space.

Bilingual Lexicon Induction Entity Linking +9

Paper
Add Code

Bilingual Lexicon Induction by Learning to Combine Word-Level and Character-Level Representations

no code implementations • EACL 2017 • Geert Heyman, Ivan Vuli{\'c}, Marie-Francine Moens

We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology.

Bilingual Lexicon Induction Classification +10

Paper
Add Code

If Sentences Could See: Investigating Visual Information for Semantic Textual Similarity

no code implementations • WS 2017 • Goran Glava{\v{s}}, Ivan Vuli{\'c}, Simone Paolo Ponzetto

Machine Translation Semantic Textual Similarity

Paper
Add Code

Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints

no code implementations • TACL 2017 • Nikola Mrk{\v{s}}i{\'c}, Ivan Vuli{\'c}, Diarmuid {\'O} S{\'e}aghdha, Ira Leviant, Roi Reichart, Milica Ga{\v{s}}i{\'c}, Anna Korhonen, Steve Young

We present Attract-Repel, an algorithm for improving the semantic quality of word vectors by injecting constraints extracted from lexical resources.

Dialogue State Tracking Representation Learning +2

Paper
Add Code

On the Role of Seed Lexicons in Learning Bilingual Word Embeddings

no code implementations • ACL 2016 • Ivan Vuli{\'c}, Anna Korhonen

Cross-Lingual Entity Linking Entity Linking +1

Paper
Add Code

Is ``Universal Syntax'' Universally Useful for Learning Distributed Word Representations?

no code implementations • ACL 2016 • Ivan Vuli{\'c}, Anna Korhonen

Word Embeddings

Paper
Add Code

Multi-Modal Representations for Improved Bilingual Lexicon Learning

no code implementations • ACL 2016 • Ivan Vuli{\'c}, Douwe Kiela, Stephen Clark, Marie-Francine Moens

Information Retrieval Machine Translation +1

Paper
Add Code

Visual Bilingual Lexicon Induction with Transferred ConvNet Features

no code implementations • EMNLP 2015 • Douwe Kiela, Ivan Vuli{\'c}, Stephen Clark

Bilingual Lexicon Induction Information Retrieval

Paper
Add Code

Exploiting Image Generality for Lexical Entailment Detection

no code implementations • IJCNLP 2015 • Douwe Kiela, Laura Rimell, Ivan Vuli{\'c}, Stephen Clark

Image Retrieval Lexical Entailment +3

Paper
Add Code

Bilingual Word Embeddings from Non-Parallel Document-Aligned Data Applied to Bilingual Lexicon Induction

no code implementations • IJCNLP 2015 • Ivan Vuli{\'c}, Marie-Francine Moens

Bilingual Lexicon Induction Language Modelling +1

Paper
Add Code

TKLBLIIR: Detecting Twitter Paraphrases with TweetingJay

no code implementations • SEMEVAL 2015 • Mladen Karan, Goran Glava{\v{s}}, Jan {\v{S}}najder, Bojana Dalbelo Ba{\v{s}}i{\'c}, Ivan Vuli{\'c}, Marie-Francine Moens

Information Retrieval Machine Translation +2

Paper
Add Code

Probabilistic Models of Cross-Lingual Semantic Similarity in Context Based on Latent Cross-Lingual Concepts Induced from Comparable Data

no code implementations • EMNLP 2014 • Ivan Vuli{\'c}, Marie-Francine Moens

Information Retrieval Machine Translation +6

Paper
Add Code

TermWise: A CAT-tool with Context-Sensitive Terminological Support.

no code implementations • LREC 2014 • Kris Heylen, Stephen Bond, Dirk De Hertog, Ivan Vuli{\'c}, Hendrik Kockaert

In this paper, we report on the TermWise project, a cooperation of terminologists, corpus linguists and computer scientists, that aims to leverage big online translation data for terminological support to legal translators at the Belgian Federal Ministry of Justice.

Translation