1 code implementation • 15 Apr 2024 • Adi Simhi, Jonathan Herzig, Idan Szpektor, Yonatan Belinkov
In this work, we first introduce an approach for constructing datasets based on the model's knowledge, for use by detection and intervention methods in closed-book and open-book question-answering settings.
no code implementations • 15 Feb 2024 • Shashwat Singh, Shauli Ravfogel, Jonathan Herzig, Roee Aharoni, Ryan Cotterell, Ponnurangam Kumaraguru
We demonstrate the effectiveness of the proposed approaches in mitigating bias in multiclass classification and in reducing the generation of toxic language, outperforming strong baselines.
no code implementations • 1 Feb 2024 • Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva
REVEAL includes comprehensive labels for the relevance, attribution to evidence passages, and logical correctness of each reasoning step in a language model's answer, across a variety of datasets and state-of-the-art language models.
no code implementations • 3 Jan 2024 • Uri Shaham, Jonathan Herzig, Roee Aharoni, Idan Szpektor, Reut Tsarfaty, Matan Eyal
As instruction-tuned large language models (LLMs) gain global adoption, their ability to follow instructions in multiple languages becomes increasingly crucial.
no code implementations • 16 Oct 2023 • Alon Jacovi, Avi Caciularu, Jonathan Herzig, Roee Aharoni, Bernd Bohnet, Mor Geva
A growing area of research investigates augmenting language models with tools (e.g., search engines, calculators) to overcome their shortcomings (e.g., missing or incorrect knowledge, incorrect logical inferences).
no code implementations • 23 May 2023 • Benjamin Muller, John Wieting, Jonathan H. Clark, Tom Kwiatkowski, Sebastian Ruder, Livio Baldini Soares, Roee Aharoni, Jonathan Herzig, Xinyi Wang
Based on these models, we improve the attribution level of a cross-lingual question-answering system.
1 code implementation • 18 May 2023 • Zorik Gekhman, Jonathan Herzig, Roee Aharoni, Chen Elkind, Idan Szpektor
Factual consistency evaluation is often conducted using Natural Language Inference (NLI) models, yet these models exhibit limited success in evaluating summaries.
1 code implementation • NeurIPS 2023 • Michal Yarom, Yonatan Bitton, Soravit Changpinyo, Roee Aharoni, Jonathan Herzig, Oran Lang, Eran Ofek, Idan Szpektor
Automatically determining whether a text and a corresponding image are semantically aligned is a significant challenge for vision-language models, with applications in generative text-to-image and image-to-text tasks.
Ranked #11 on Visual Reasoning on Winoground
no code implementations • 20 Dec 2022 • Roee Aharoni, Shashi Narayan, Joshua Maynez, Jonathan Herzig, Elizabeth Clark, Mirella Lapata
Abstractive summarization has enjoyed renewed interest in recent years, thanks to pre-trained language models and the availability of large-scale datasets.
1 code implementation • 15 Dec 2022 • Bernd Bohnet, Vinh Q. Tran, Pat Verga, Roee Aharoni, Daniel Andor, Livio Baldini Soares, Massimiliano Ciaramita, Jacob Eisenstein, Kuzman Ganchev, Jonathan Herzig, Kai Hui, Tom Kwiatkowski, Ji Ma, Jianmo Ni, Lierni Sestorain Saralegui, Tal Schuster, William W. Cohen, Michael Collins, Dipanjan Das, Donald Metzler, Slav Petrov, Kellie Webster
We take human annotations as a gold standard and show that a correlated automatic metric is suitable for development.
no code implementations • 25 May 2022 • Samuel Joseph Amouyal, Tomer Wolfson, Ohad Rubin, Ori Yoran, Jonathan Herzig, Jonathan Berant
Our results highlight the need for developing ODQA models that handle a broad range of question types, including single and multi-answer questions.
no code implementations • 24 May 2022 • Linlu Qiu, Peter Shaw, Panupong Pasupat, Tianze Shi, Jonathan Herzig, Emily Pitler, Fei Sha, Kristina Toutanova
Meanwhile, recent work has shown considerable improvements on many NLP tasks from model scaling.
1 code implementation • NAACL 2022 • Or Honovich, Roee Aharoni, Jonathan Herzig, Hagai Taitelbaum, Doron Kukliansy, Vered Cohen, Thomas Scialom, Idan Szpektor, Avinatan Hassidim, Yossi Matias
Grounded text generation systems often generate text that contains factual inconsistencies, hindering their real-world applicability.
2 code implementations • NAACL 2022 • Ohad Rubin, Jonathan Herzig, Jonathan Berant
In-context learning is a recent paradigm in natural language understanding, where a large pre-trained language model (LM) observes a test instance and a few training examples as its input, and directly decodes the output without any update to its parameters.
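The paradigm described above can be illustrated with a minimal sketch of few-shot prompt construction. The task, the demonstrations, and the prompt format here are hypothetical and illustrative; the paper's actual contribution concerns learning to retrieve which demonstrations to include, not this generic formatting step.

```python
# Minimal illustration of in-context learning: training examples are
# concatenated into the prompt, and a frozen LM is expected to continue
# the pattern for the test instance -- no parameter updates occur.
# The arithmetic task and examples below are hypothetical.

def build_prompt(train_examples, test_input):
    """Format a few-shot prompt from (input, output) demonstration pairs."""
    parts = [f"Input: {x}\nOutput: {y}" for x, y in train_examples]
    # The test instance is appended with an empty output slot for the LM
    # to complete.
    parts.append(f"Input: {test_input}\nOutput:")
    return "\n\n".join(parts)

demos = [("2 + 3", "5"), ("7 + 1", "8")]
prompt = build_prompt(demos, "4 + 4")
print(prompt)
```

The prompt string would then be passed to a pre-trained LM for completion; which demonstrations go into `demos` for a given test instance is exactly the retrieval problem the paper studies.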
1 code implementation • EMNLP 2021 • Inbar Oren, Jonathan Herzig, Jonathan Berant
We evaluate our approach on a new split of the schema2QA dataset, and show that it leads to dramatic improvements in compositional generalization as well as moderate improvements in the traditional i.i.d. setup.
2 code implementations • 15 Apr 2021 • Jonathan Herzig, Peter Shaw, Ming-Wei Chang, Kelvin Guu, Panupong Pasupat, Yuan Zhang
Sequence-to-sequence (seq2seq) models are prevalent in semantic parsing, but have been found to struggle at out-of-distribution compositional generalization.
Ranked #3 on Semantic Parsing on CFQ
1 code implementation • NAACL 2021 • Jonathan Herzig, Thomas Müller, Syrine Krichene, Julian Martin Eisenschlos
Recent advances in open-domain QA have led to strong models based on dense retrieval, but these have focused only on retrieving textual passages.
1 code implementation • Findings of the Association for Computational Linguistics 2020 • Inbar Oren, Jonathan Herzig, Nitish Gupta, Matt Gardner, Jonathan Berant
Generalization of models to out-of-distribution (OOD) data has captured tremendous attention recently.
1 code implementation • ACL 2021 • Jonathan Herzig, Jonathan Berant
Despite the success of sequence-to-sequence (seq2seq) models in semantic parsing, recent work has shown that they fail in compositional generalization, i.e., the ability to generalize to new structures built of components observed during training.
5 code implementations • ACL 2020 • Jonathan Herzig, Paweł Krzysztof Nowak, Thomas Müller, Francesco Piccinno, Julian Martin Eisenschlos
In this paper, we present TAPAS, an approach to question answering over tables without generating logical forms.
Ranked #1 on Semantic Parsing on SQA (Accuracy metric)
no code implementations • IJCNLP 2019 • Shai Erera, Michal Shmueli-Scheuer, Guy Feigenblat, Ora Peled Nakash, Odellia Boni, Haggai Roitman, Doron Cohen, Bar Weiner, Yosi Mass, Or Rivlin, Guy Lev, Achiya Jerbi, Jonathan Herzig, Yufang Hou, Charles Jochim, Martin Gleize, Francesca Bonin, David Konopnicki
We present a novel system providing summaries for Computer Science publications.
1 code implementation • IJCNLP 2019 • Jonathan Herzig, Jonathan Berant
Assuming access to unlabeled utterances from the true distribution, we combine crowdsourcing with a paraphrase model to detect correct logical forms for the unlabeled utterances.
1 code implementation • ACL 2019 • Guy Lev, Michal Shmueli-Scheuer, Jonathan Herzig, Achiya Jerbi, David Konopnicki
We collected 1716 papers and their corresponding videos, and created a dataset of paper summaries.
no code implementations • SEMEVAL 2019 • Jonathan Herzig, Tommy Sandbank, Michal Shmueli-Scheuer, David Konopnicki
Chatbots (i.e., bots) are becoming widely used in multiple domains, along with supporting bot programming platforms.
3 code implementations • NAACL 2019 • Alon Talmor, Jonathan Herzig, Nicholas Lourie, Jonathan Berant
To investigate question answering with prior knowledge, we present CommonsenseQA: a challenging new dataset for commonsense question answering.
Ranked #30 on Common Sense Reasoning on CommonsenseQA (using extra training data)
1 code implementation • NAACL 2019 • Dor Muhlgay, Jonathan Herzig, Jonathan Berant
Training models to map natural language instructions to programs, given only target world supervision, requires searching for good programs at training time.
1 code implementation • EMNLP 2018 • Jonathan Herzig, Jonathan Berant
Building a semantic parser quickly in a new domain is a fundamental challenge for conversational interfaces, as current semantic parsers require expensive supervision and lack the ability to generalize to new domains.
no code implementations • NAACL 2018 • Tommy Sandbank, Michal Shmueli-Scheuer, Jonathan Herzig, David Konopnicki, John Richards, David Piorkowski
In this paper, we outline an approach to detecting such egregious conversations, using behavioral cues from the user, patterns in agent responses, and user-agent interaction.
no code implementations • WS 2017 • Jonathan Herzig, Michal Shmueli-Scheuer, Tommy Sandbank, David Konopnicki
We present a neural response generation model that generates responses conditioned on a target personality.
1 code implementation • ACL 2017 • Jonathan Herzig, Jonathan Berant
A fundamental challenge in developing semantic parsers is the paucity of strong supervision in the form of language utterances annotated with logical form.