Search Results for author: Avi Caciularu

Found 38 papers, 19 papers with code

Within-Between Lexical Relation Classification

no code implementations EMNLP 2020 Oren Barkan, Avi Caciularu, Ido Dagan

We propose the novel Within-Between Relation model for recognizing lexical-semantic relations between words.

Classification, General Classification, +2

Unpacking Tokenization: Evaluating Text Compression and its Correlation with Model Performance

no code implementations 10 Mar 2024 Omer Goldman, Avi Caciularu, Matan Eyal, Kris Cao, Idan Szpektor, Reut Tsarfaty

Although compression is the cornerstone of BPE, the most common tokenization algorithm, the importance of compression in the tokenization process is still unclear.

Language Modelling, Text Compression

Patchscopes: A Unifying Framework for Inspecting Hidden Representations of Language Models

1 code implementation 11 Jan 2024 Asma Ghandeharioun, Avi Caciularu, Adam Pearce, Lucas Dixon, Mor Geva

Inspecting the information encoded in hidden representations of large language models (LLMs) can explain models' behavior and verify their alignment with human values.

Optimizing Retrieval-augmented Reader Models via Token Elimination

1 code implementation 20 Oct 2023 Moshe Berchansky, Peter Izsak, Avi Caciularu, Ido Dagan, Moshe Wasserblat

Fusion-in-Decoder (FiD) is an effective retrieval-augmented language model applied across a variety of open-domain tasks, such as question answering and fact checking.

Answer Generation, Fact Checking, +3

A Comprehensive Evaluation of Tool-Assisted Generation Strategies

no code implementations 16 Oct 2023 Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva

A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences).

Retrieval

Representation Learning via Variational Bayesian Networks

no code implementations 28 Jun 2023 Oren Barkan, Avi Caciularu, Idan Rejwan, Ori Katz, Jonathan Weill, Itzik Malkiel, Noam Koenigstein

We present Variational Bayesian Network (VBN), a novel Bayesian entity representation learning model that utilizes hierarchical and relational side information and is particularly useful for modeling entities in the "long tail", where data is scarce.

Bayesian Inference, Representation Learning

Revisiting Sentence Union Generation as a Testbed for Text Consolidation

1 code implementation 24 May 2023 Eran Hirsch, Valentina Pyatkin, Ruben Wolhandler, Avi Caciularu, Asi Shefer, Ido Dagan

In this paper, we suggest revisiting the sentence union generation task as an effective, well-defined testbed for assessing text consolidation capabilities, decoupling the consolidation challenge from subjective content selection.

Document Summarization, Long Form Question Answering, +2

Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering

1 code implementation 24 May 2023 Avi Caciularu, Matthew E. Peters, Jacob Goldberger, Ido Dagan, Arman Cohan

The integration of multi-document pre-training objectives into language models has resulted in remarkable improvements in multi-document downstream tasks.

Query-focused Summarization, Question Answering, +2

Stop Uploading Test Data in Plain Text: Practical Strategies for Mitigating Data Contamination by Evaluation Benchmarks

1 code implementation 17 May 2023 Alon Jacovi, Avi Caciularu, Omer Goldman, Yoav Goldberg

Data contamination has become prevalent and challenging with the rise of models pretrained on large automatically-crawled corpora.

Cross-document Event Coreference Search: Task, Dataset and Modeling

1 code implementation 23 Oct 2022 Alon Eirew, Avi Caciularu, Ido Dagan

The task of Cross-document Coreference Resolution has traditionally been formulated as identifying all coreference links across a given set of documents.

Cross Document Coreference Resolution, Open-Domain Question Answering, +2

Interpreting BERT-based Text Similarity via Activation and Saliency Maps

no code implementations 13 Aug 2022 Itzik Malkiel, Dvir Ginzburg, Oren Barkan, Avi Caciularu, Jonathan Weill, Noam Koenigstein

Recently, there has been growing interest in the ability of Transformer-based models to produce meaningful text embeddings, with applications such as text similarity.

text similarity

MetricBERT: Text Representation Learning via Self-Supervised Triplet Training

no code implementations 13 Aug 2022 Itzik Malkiel, Dvir Ginzburg, Oren Barkan, Avi Caciularu, Yoni Weill, Noam Koenigstein

We present MetricBERT, a BERT-based model that learns to embed text under a well-defined similarity metric while simultaneously adhering to the "traditional" masked-language task.

Representation Learning

QASem Parsing: Text-to-text Modeling of QA-based Semantics

1 code implementation 23 May 2022 Ayal Klein, Eran Hirsch, Ron Eliav, Valentina Pyatkin, Avi Caciularu, Ido Dagan

Several recent works have suggested representing semantic relations with questions and answers, decomposing textual information into separate interrogative natural language statements.

Data Augmentation

LM-Debugger: An Interactive Tool for Inspection and Intervention in Transformer-Based Language Models

1 code implementation 26 Apr 2022 Mor Geva, Avi Caciularu, Guy Dar, Paul Roit, Shoval Sadde, Micah Shlain, Bar Tamir, Yoav Goldberg

The opaque nature and unexplained behavior of transformer-based language models (LMs) have spurred a wide interest in interpreting their predictions.

Grad-SAM: Explaining Transformers via Gradient Self-Attention Maps

no code implementations 23 Apr 2022 Oren Barkan, Edan Hauon, Avi Caciularu, Ori Katz, Itzik Malkiel, Omri Armstrong, Noam Koenigstein

Transformer-based language models have significantly advanced the state of the art in many linguistic tasks.

Transformer Feed-Forward Layers Build Predictions by Promoting Concepts in the Vocabulary Space

1 code implementation 28 Mar 2022 Mor Geva, Avi Caciularu, Kevin Ro Wang, Yoav Goldberg

Transformer-based language models (LMs) are at the core of modern NLP, but their internal prediction construction process is opaque and largely not understood.

Proposition-Level Clustering for Multi-Document Summarization

2 code implementations NAACL 2022 Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

Clustering, Document Summarization, +3

Cold Item Integration in Deep Hybrid Recommenders via Tunable Stochastic Gates

no code implementations 12 Dec 2021 Oren Barkan, Roy Hirsch, Ori Katz, Avi Caciularu, Jonathan Weill, Noam Koenigstein

Next, we propose a novel hybrid recommendation algorithm that bridges these two conflicting objectives, enabling a harmonized balance between preserving high accuracy for warm items and effectively promoting completely cold items.

Collaborative Filtering

GAM: Explainable Visual Similarity and Classification via Gradient Activation Maps

no code implementations 2 Sep 2021 Oren Barkan, Omri Armstrong, Amir Hertz, Avi Caciularu, Ori Katz, Itzik Malkiel, Noam Koenigstein

The algorithmic advantages of GAM are explained in detail, and validated empirically, where it is shown that GAM outperforms its alternatives across various tasks and datasets.

Classification

On the Evolution of Word Order

no code implementations RANLP 2021 Idan Rejwan, Avi Caciularu

We also show that adding information to the sentence, such as case markers and noun-verb distinction, reduces the need for fixed word order, in accordance with the typological findings.

Sentence

CDLM: Cross-Document Language Modeling

2 code implementations Findings (EMNLP) 2021 Avi Caciularu, Arman Cohan, Iz Beltagy, Matthew E. Peters, Arie Cattan, Ido Dagan

We introduce a new pretraining approach geared for multi-document language modeling, incorporating two key ideas into the masked language modeling self-supervised objective.

Citation Recommendation, Coreference Resolution, +6

A Mixture of Variational Autoencoders for Deep Clustering

no code implementations 1 Jan 2021 Avi Caciularu, Jacob Goldberger

In this study we propose a deep clustering algorithm that utilizes the variational autoencoder (VAE) framework with a multi-encoder-decoder neural architecture.

Clustering, Deep Clustering

Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention

no code implementations 1 Jan 2021 Nir Raviv, Avi Caciularu, Tomer Raviv, Jacob Goldberger, Yair Be'ery

Error correction codes are an integral part of communication applications and boost the reliability of transmission.

Bayesian Hierarchical Words Representation Learning

no code implementations ACL 2020 Oren Barkan, Idan Rejwan, Avi Caciularu, Noam Koenigstein

BHWR facilitates Variational Bayes word representation learning combined with semantic taxonomy modeling via hierarchical priors.

Representation Learning

Attentive Item2Vec: Neural Attentive User Representations

no code implementations 15 Feb 2020 Oren Barkan, Avi Caciularu, Ori Katz, Noam Koenigstein

However, it is possible that a certain early movie may suddenly become more relevant in the presence of a popular sequel.

Recommendation Systems

perm2vec: Graph Permutation Selection for Decoding of Error Correction Codes using Self-Attention

no code implementations 6 Feb 2020 Nir Raviv, Avi Caciularu, Tomer Raviv, Jacob Goldberger, Yair Be'ery

Error correction codes are an integral part of communication applications, boosting the reliability of transmission.

Scalable Attentive Sentence-Pair Modeling via Distilled Sentence Embedding

1 code implementation 14 Aug 2019 Oren Barkan, Noam Razin, Itzik Malkiel, Ori Katz, Avi Caciularu, Noam Koenigstein

In this paper, we introduce Distilled Sentence Embedding (DSE), a model that is based on knowledge distillation from cross-attentive models, focusing on sentence-pair tasks.

Knowledge Distillation, Natural Language Understanding, +4

Unsupervised Linear and Nonlinear Channel Equalization and Decoding using Variational Autoencoders

no code implementations 21 May 2019 Avi Caciularu, David Burshtein

We first consider the reconstruction of uncoded data symbols transmitted over a noisy linear intersymbol interference (ISI) channel, with an unknown impulse response, without using pilot symbols.

Variational Inference

Blind Channel Equalization using Variational Autoencoders

no code implementations 5 Mar 2018 Avi Caciularu, David Burshtein

A new maximum likelihood estimation approach for blind channel equalization, using variational autoencoders (VAEs), is introduced.

Inducing Regular Grammars Using Recurrent Neural Networks

1 code implementation 28 Oct 2017 Mor Cohen, Avi Caciularu, Idan Rejwan, Jonathan Berant

Grammar induction is the task of learning a grammar from a set of examples.
