Search Results for author: Mrinmaya Sachan

Interpretability research aims to bridge the gap between the empirical success and our scientific understanding of the inner workings of large language models (LLMs).

AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators

no code implementations • 16 Feb 2024 • Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold

With the rise of generative AI, automated fact-checking methods to combat misinformation are becoming more and more important.

Fact Checking Misinformation

Scaling the Authoring of AutoTutors with Large Language Models

no code implementations • 14 Feb 2024 • Sankalan Pal Chowdhury, Vilém Zouhar, Mrinmaya Sachan

Large Language Models (LLMs) have found several use cases in education, ranging from automatic question generation to essay evaluation.

Math Question Generation +1

Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners?

no code implementations • 31 Jan 2024 • Andreas Opedal, Alessandro Stolfo, Haruki Shirakami, Ying Jiao, Ryan Cotterell, Bernhard Schölkopf, Abulhair Saparov, Mrinmaya Sachan

We find evidence that LLMs, with and without instruction-tuning, exhibit human-like biases in both the text-comprehension and the solution-planning steps of the solving process, but not during the final step which relies on the problem's arithmetic expressions (solution execution).

Reading Comprehension

CLadder: Assessing Causal Reasoning in Language Models

1 code implementation • NeurIPS 2023 • Zhijing Jin, Yuen Chen, Felix Leeb, Luigi Gresele, Ojasv Kamal, Zhiheng Lyu, Kevin Blin, Fernando Gonzalez Adauto, Max Kleiman-Weiner, Mrinmaya Sachan, Bernhard Schölkopf

Much of the existing work in natural language processing (NLP) focuses on evaluating commonsense causal reasoning in LLMs, thus failing to assess whether a model can perform causal inference in accordance with a set of well-defined formal rules.

Causal Inference Commonsense Causal Reasoning +1

RELIC: Investigating Large Language Model Responses using Self-Consistency

no code implementations • 28 Nov 2023 • Furui Cheng, Vilém Zouhar, Simran Arora, Mrinmaya Sachan, Hendrik Strobelt, Mennatallah El-Assady

To address this challenge, we propose an interactive system that helps users gain insight into the reliability of the generated text.

Language Modelling Large Language Model

Navigating the Ocean of Biases: Political Bias Attribution in Language Models via Causal Structures

1 code implementation • 15 Nov 2023 • David F. Jenny, Yann Billeter, Mrinmaya Sachan, Bernhard Schölkopf, Zhijing Jin

The rapid advancement of Large Language Models (LLMs) has sparked intense debate regarding their ability to perceive and interpret complex socio-political landscapes.

Decision Making

The ART of LLM Refinement: Ask, Refine, and Trust

no code implementations • 14 Nov 2023 • Kumar Shridhar, Koustuv Sinha, Andrew Cohen, Tianlu Wang, Ping Yu, Ram Pasunuru, Mrinmaya Sachan, Jason Weston, Asli Celikyilmaz

In recent years, Large Language Models (LLMs) have demonstrated remarkable generative abilities, but can they judge the quality of their own generations?

Ranked #12 on Arithmetic Reasoning on GSM8K

Arithmetic Reasoning GSM8K +2

CausalCite: A Causal Formulation of Paper Citations

1 code implementation • 5 Nov 2023 • Ishan Kumar, Zhijing Jin, Ehsan Mokhtarian, Siyuan Guo, Yuen Chen, Mrinmaya Sachan, Bernhard Schölkopf

Evaluating the significance of a paper is pivotal yet challenging for the scientific community.

Causal Inference counterfactual

Towards a Mechanistic Interpretation of Multi-Step Reasoning Capabilities of Language Models

1 code implementation • 23 Oct 2023 • Yifan Hou, Jiaoda Li, Yu Fei, Alessandro Stolfo, Wangchunshu Zhou, Guangtao Zeng, Antoine Bosselut, Mrinmaya Sachan

We show that MechanisticProbe is able to detect the information of the reasoning tree from the model's attentions for most examples, suggesting that the LM indeed is going through a process of multi-step reasoning within its architecture in many cases.

A Diachronic Perspective on User Trust in AI under Uncertainty

1 code implementation • 20 Oct 2023 • Shehzaad Dhuliawala, Vilém Zouhar, Mennatallah El-Assady, Mrinmaya Sachan

In a human-AI collaboration, users build a mental model of the AI system based on its reliability and how it presents its decision, e.g., its presentation of system confidence and an explanation of the output.

Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

1 code implementation • 20 Oct 2023 • Ruida Wang, Wangchunshu Zhou, Mrinmaya Sachan

Data synthesis is a promising way to train a small model with very little labeled data.

Language Modelling Large Language Model

Agents: An Open-source Framework for Autonomous Language Agents

1 code implementation • 14 Sep 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Long Li, Jialong Wu, Tiannan Wang, Shi Qiu, Jintian Zhang, Jing Chen, Ruipu Wu, Shuai Wang, Shiding Zhu, Jiyu Chen, Wentao Zhang, Xiangru Tang, Ningyu Zhang, Huajun Chen, Peng Cui, Mrinmaya Sachan

Recent advances on large language models (LLMs) enable researchers and developers to build autonomous language agents that can automatically solve various tasks and interact with environments, humans, and other agents using natural language interfaces.

A Formal Perspective on Byte-Pair Encoding

1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Tim Vieira, Mrinmaya Sachan, Ryan Cotterell

Via submodular functions, we prove that the iterative greedy version is a $\frac{1}{\sigma(\boldsymbol{\mu}^\star)}\left(1-e^{-\sigma(\boldsymbol{\mu}^\star)}\right)$-approximation of an optimal merge sequence, where $\sigma(\boldsymbol{\mu}^\star)$ is the total backward curvature with respect to the optimal merge sequence $\boldsymbol{\mu}^\star$.

Combinatorial Optimization
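
For readers unfamiliar with the "iterative greedy version" mentioned in the snippet above, the following is a minimal sketch of greedy byte-pair merging on a single string. It is an illustration only, not the authors' code: the function name `greedy_bpe` and the lexicographic tie-breaking rule are assumptions, and the sketch does not compute the approximation bound itself.

```python
from collections import Counter


def greedy_bpe(text, num_merges):
    """Greedily merge the most frequent adjacent token pair, num_merges times.

    Hypothetical helper for illustration; ties are broken lexicographically
    (an assumption made here only to keep the output deterministic).
    """
    tokens = list(text)  # start from single characters
    merges = []
    for _ in range(num_merges):
        pair_counts = Counter(zip(tokens, tokens[1:]))
        if not pair_counts:
            break  # fewer than two tokens left, nothing to merge
        # Most frequent pair first; lexicographically smallest pair on ties.
        best = min(pair_counts, key=lambda p: (-pair_counts[p], p))
        merged, i = [], 0
        while i < len(tokens):
            # Apply the chosen merge left to right, without overlaps.
            if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) == best:
                merged.append(tokens[i] + tokens[i + 1])
                i += 2
            else:
                merged.append(tokens[i])
                i += 1
        tokens = merged
        merges.append(best)
    return merges, tokens


merges, tokens = greedy_bpe("aaabdaaabac", 3)
print(merges)
print(tokens)
```

Each iteration picks the locally best merge without looking ahead, which is why the quality of the resulting merge sequence relative to an optimal one is a meaningful question.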

Tokenization and the Noiseless Channel

1 code implementation • 29 Jun 2023 • Vilém Zouhar, Clara Meister, Juan Luis Gastaldi, Li Du, Mrinmaya Sachan, Ryan Cotterell

Subword tokenization is a key part of many NLP pipelines.

Machine Translation

Can Large Language Models Infer Causation from Correlation?

1 code implementation • 9 Jun 2023 • Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, Bernhard Schölkopf

In this work, we propose the first benchmark dataset to test the pure causal inference skills of large language models (LLMs).

Causal Inference

World Models for Math Story Problems

1 code implementation • 7 Jun 2023 • Andreas Opedal, Niklas Stoehr, Abulhair Saparov, Mrinmaya Sachan

In this paper, we consolidate previous work on categorizing and representing math story problems and develop MathWorld, which is a graph-based semantic formalism specific for the domain of math story problems.

Math

Infusing Lattice Symmetry Priors in Attention Mechanisms for Sample-Efficient Abstract Geometric Reasoning

no code implementations • 5 Jun 2023 • Mattia Atzeni, Mrinmaya Sachan, Andreas Loukas

As a step towards this goal, we focus on geometry priors and introduce LatFormer, a model that incorporates lattice symmetry priors in attention masks.

Adaptive and Personalized Exercise Generation for Online Language Learning

1 code implementation • 4 Jun 2023 • Peng Cui, Mrinmaya Sachan

We train and evaluate our model on real-world learner interaction data from Duolingo and demonstrate that LMs guided by student states can generate superior exercises.

Knowledge Tracing Text Generation

Membership Inference Attacks against Language Models via Neighbourhood Comparison

1 code implementation • 29 May 2023 • Justus Mattern, FatemehSadat Mireshghallah, Zhijing Jin, Bernhard Schölkopf, Mrinmaya Sachan, Taylor Berg-Kirkpatrick

To investigate whether this fragility provides a layer of safety, we propose and evaluate neighbourhood attacks, which compare model scores for a given sample to scores of synthetically generated neighbour texts and therefore eliminate the need for access to the training data distribution.
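
The comparison described in the snippet above can be sketched roughly as follows. This is an assumption-laden illustration rather than the paper's implementation: `loss_fn` stands in for any per-text model score (here a toy lookup table), the neighbour texts are supplied by hand instead of being generated, and the difference-plus-threshold decision rule is one plausible reading of "compare".

```python
from statistics import mean


def neighbourhood_score(sample, neighbours, loss_fn):
    """Loss of the sample minus the mean loss of its neighbour texts.

    A strongly negative value means the model scores the sample much
    better than similar texts (hypothetical scoring rule for illustration).
    """
    return loss_fn(sample) - mean(loss_fn(n) for n in neighbours)


def is_member(sample, neighbours, loss_fn, threshold=0.0):
    """Flag the sample as a likely training member if its score is below threshold."""
    return neighbourhood_score(sample, neighbours, loss_fn) < threshold


# Toy stand-in for a language model's per-text loss (made-up numbers).
toy_losses = {
    "the cat sat": 1.0,
    "the cat sits": 2.5,
    "a cat sat": 3.5,
    "dogs bark loudly": 3.0,
    "dogs bark loud": 2.5,
    "a dog barks loudly": 3.5,
}
loss_fn = toy_losses.__getitem__

print(is_member("the cat sat", ["the cat sits", "a cat sat"], loss_fn))
print(is_member("dogs bark loudly", ["dogs bark loud", "a dog barks loudly"], loss_fn))
```

Because the sample is only compared against its own neighbours, no reference data drawn from the training distribution is needed, which is the property the snippet highlights.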

A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis

1 code implementation • 24 May 2023 • Alessandro Stolfo, Yonatan Belinkov, Mrinmaya Sachan

Mathematical reasoning in large language models (LMs) has garnered significant attention in recent work, but there is a limited understanding of how these models process and store information related to arithmetic tasks within their architecture.

Arithmetic Reasoning Mathematical Reasoning +2

Linear-Time Modeling of Linguistic Structure: An Order-Theoretic Perspective

no code implementations • 24 May 2023 • Tianyu Liu, Afra Amini, Mrinmaya Sachan, Ryan Cotterell

We show that these exhaustive comparisons can be avoided, and, moreover, the complexity of such tasks can be reduced to linear by casting the relation between tokens as a partial order over the string.

coreference-resolution Dependency Parsing +1

When Does Aggregating Multiple Skills with Multi-Task Learning Work? A Case Study in Financial NLP

2 code implementations • 23 May 2023 • Jingwei Ni, Zhijing Jin, Qian Wang, Mrinmaya Sachan, Markus Leippold

Due to the task difficulty and data scarcity in the Financial NLP domain, we explore when aggregating such diverse skills from multiple datasets with MTL can work.

Multi-Task Learning Open-Ended Question Answering +1

All Roads Lead to Rome? Exploring the Invariance of Transformers' Representations

1 code implementation • 23 May 2023 • Yuxin Ren, Qipeng Guo, Zhijing Jin, Shauli Ravfogel, Mrinmaya Sachan, Bernhard Schölkopf, Ryan Cotterell

Transformer models bring propelling advances in various NLP tasks, thus inducing lots of interpretability research on the learned representations of the models.

MathDial: A Dialogue Tutoring Dataset with Rich Pedagogical Properties Grounded in Math Reasoning Problems

1 code implementation • 23 May 2023 • Jakub Macina, Nico Daheim, Sankalan Pal Chowdhury, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

While automatic dialogue tutors hold great potential in making education personalized and more accessible, research on such systems has been hampered by a lack of sufficiently large and high-quality datasets.

Language Modelling Large Language Model +1

RecurrentGPT: Interactive Generation of (Arbitrarily) Long Text

2 code implementations • 22 May 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Peng Cui, Tiannan Wang, Zhenxin Xiao, Yifan Hou, Ryan Cotterell, Mrinmaya Sachan

In addition to producing AI-generated content (AIGC), we also demonstrate the possibility of using RecurrentGPT as an interactive fiction that directly interacts with consumers.

Language Modelling Large Language Model

Revisiting Automated Topic Model Evaluation with Large Language Models

1 code implementation • 20 May 2023 • Dominik Stammbach, Vilém Zouhar, Alexander Hoyle, Mrinmaya Sachan, Elliott Ash

Topic models are used to make sense of large text collections.

Topic Models

Efficient Prompting via Dynamic In-Context Learning

no code implementations • 18 May 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

To achieve this, we train a meta controller that predicts the number of in-context examples suitable for the generalist model to make a good prediction based on the performance-efficiency trade-off for a specific input.

In-Context Learning

Discourse Centric Evaluation of Machine Translation with a Densely Annotated Parallel Corpus

1 code implementation • 18 May 2023 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

Several recent papers claim human parity at sentence-level Machine Translation (MT), especially in high-resource languages.

Machine Translation Sentence +1

Variational Classification

1 code implementation • 17 May 2023 • Shehzaad Dhuliawala, Mrinmaya Sachan, Carl Allen

We present a latent variable model for classification that provides a novel probabilistic interpretation of neural network softmax classifiers.

Adversarial Robustness text-classification +1

Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good

1 code implementation • 9 May 2023 • Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan, Rada Mihalcea

Using state-of-the-art NLP models, we address each of these tasks and use them on the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG.

Psychologically-Inspired Causal Prompts

1 code implementation • 2 May 2023 • Zhiheng Lyu, Zhijing Jin, Justus Mattern, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schoelkopf

In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y).

Sentiment Analysis Sentiment Classification

Controlled Text Generation with Natural Language Instructions

1 code implementation • 27 Apr 2023 • Wangchunshu Zhou, Yuchen Eleanor Jiang, Ethan Wilcox, Ryan Cotterell, Mrinmaya Sachan

Large language models generate fluent texts and can follow natural language instructions to solve a wide range of tasks without task-specific training.

In-Context Learning Language Modelling +1

Enhancing Textbooks with Visuals from the Web for Improved Learning

1 code implementation • 18 Apr 2023 • Janvijay Singh, Vilém Zouhar, Mrinmaya Sachan

We release the dataset of textbooks with an associated image bank to inspire further research in this intersectional area of computer vision and NLP for education.

Math

PWESuite: Phonetic Word Embeddings and Tasks They Facilitate

1 code implementation • 5 Apr 2023 • Vilém Zouhar, Kalvin Chang, Chenxuan Cui, Nathaniel Carlson, Nathaniel Robinson, Mrinmaya Sachan, David Mortensen

Mapping words into a fixed-dimensional vector space is the backbone of modern NLP.

Retrieval Word Embeddings

Elastic Weight Removal for Faithful and Abstractive Dialogue Generation

1 code implementation • 30 Mar 2023 • Nico Daheim, Nouha Dziri, Mrinmaya Sachan, Iryna Gurevych, Edoardo M. Ponti

We evaluate our method -- using different variants of Flan-T5 as a backbone language model -- on multiple datasets for information-seeking dialogue generation and compare our method with state-of-the-art techniques for faithfulness, such as CTRL, Quark, DExperts, and Noisy Channel reranking.

Dialogue Generation Language Modelling

Strategize Before Teaching: A Conversational Tutoring System with Pedagogy Self-Distillation

no code implementations • 27 Feb 2023 • Lingzhi Wang, Mrinmaya Sachan, Xingshan Zeng, Kam-Fai Wong

Conversational tutoring systems (CTSs) aim to help students master educational material with natural language interaction in the form of a dialog.

Response Generation

Opportunities and Challenges in Neural Dialog Tutoring

1 code implementation • 24 Jan 2023 • Jakub Macina, Nico Daheim, Lingzhi Wang, Tanmay Sinha, Manu Kapur, Iryna Gurevych, Mrinmaya Sachan

Designing dialog tutors has been challenging as it involves modeling the diverse and complex pedagogical strategies employed by human tutors.

Poor Man's Quality Estimation: Predicting Reference-Based MT Metrics Without the Reference

1 code implementation • 21 Jan 2023 • Vilém Zouhar, Shehzaad Dhuliawala, Wangchunshu Zhou, Nico Daheim, Tom Kocmi, Yuchen Eleanor Jiang, Mrinmaya Sachan

Machine translation quality estimation (QE) predicts human judgements of a translation hypothesis without seeing the reference.

Machine Translation Sentence +1

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

no code implementations • 20 Dec 2022 • Justus Mattern, Zhijing Jin, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics.

Benchmarking

Distilling Reasoning Capabilities into Smaller Language Models

1 code implementation • 1 Dec 2022 • Kumar Shridhar, Alessandro Stolfo, Mrinmaya Sachan

In this work, we propose an alternative reasoning scheme, Socratic CoT, that learns a decomposition of the original problem into a sequence of subproblems and uses it to guide the intermediate reasoning steps.

GSM8K Knowledge Distillation +2

Automatic Generation of Socratic Subquestions for Teaching Math Word Problems

1 code implementation • 23 Nov 2022 • Kumar Shridhar, Jakub Macina, Mennatallah El-Assady, Tanmay Sinha, Manu Kapur, Mrinmaya Sachan

On both automatic and human quality evaluations, we find that LMs constrained with desirable question properties generate superior questions and improve the overall performance of a math word problem solver.

Math Math Word Problem Solving +2

Beyond Prompting: Making Pre-trained Language Models Better Zero-shot Learners by Clustering Representations

1 code implementation • 29 Oct 2022 • Yu Fei, Ping Nie, Zhao Meng, Roger Wattenhofer, Mrinmaya Sachan

We further explore the applicability of our clustering approach by evaluating it on 14 datasets with more diverse topics, text lengths, and numbers of classes.

Clustering Sentence +7

Autoregressive Structured Prediction with Language Models

1 code implementation • 26 Oct 2022 • Tianyu Liu, Yuchen Jiang, Nicholas Monath, Ryan Cotterell, Mrinmaya Sachan

Recent years have seen a paradigm shift in NLP towards using pretrained language models (PLM) for a wide range of tasks.

Ranked #1 on Relation Extraction on CoNLL04 (RE+ Micro F1 metric)

Named Entity Recognition Named Entity Recognition (NER) +2

A Bilingual Parallel Corpus with Discourse Annotations

1 code implementation • 26 Oct 2022 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Mrinmaya Sachan, Ryan Cotterell

The BWB corpus consists of Chinese novels translated by experts into English, and the annotated test set is designed to probe the ability of machine translation systems to model various discourse phenomena.

Document Level Machine Translation Machine Translation +2

Investigating the Role of Centering Theory in the Context of Neural Coreference Resolution Systems

no code implementations • 26 Oct 2022 • Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Our analysis further shows that contextualized embeddings contain much of the coherence information, which helps explain why CT can only provide little gains to modern neural coreference resolvers which make use of pretrained representations.

coreference-resolution World Knowledge

Differentially Private Language Models for Secure Data Sharing

no code implementations • 25 Oct 2022 • Justus Mattern, Zhijing Jin, Benjamin Weggenmann, Bernhard Schoelkopf, Mrinmaya Sachan

To protect the privacy of individuals whose data is being shared, it is of high importance to develop methods allowing researchers and companies to release textual data while providing formal privacy guarantees to its originators.

Language Modelling

Adapters for Enhanced Modeling of Multilingual Knowledge and Text

1 code implementation • 24 Oct 2022 • Yifan Hou, Wenxiang Jiao, Meizhen Liu, Carl Allen, Zhaopeng Tu, Mrinmaya Sachan

Specifically, we introduce a lightweight adapter set to enhance MLLMs with cross-lingual entity alignment and facts from MLKGs for many languages.

Entity Alignment

A Causal Framework to Quantify the Robustness of Mathematical Reasoning with Language Models

1 code implementation • 21 Oct 2022 • Alessandro Stolfo, Zhijing Jin, Kumar Shridhar, Bernhard Schölkopf, Mrinmaya Sachan

By grounding the behavioral analysis in a causal graph describing an intuitive reasoning process, we study the behavior of language models in terms of robustness and sensitivity to direct interventions in the input space.

Math Mathematical Reasoning

Longtonotes: OntoNotes with Longer Coreference Chains

1 code implementation • 7 Oct 2022 • Kumar Shridhar, Nicholas Monath, Raghuveer Thirukovalluru, Alessandro Stolfo, Manzil Zaheer, Andrew McCallum, Mrinmaya Sachan

OntoNotes has served as the most important benchmark for coreference resolution.

coreference-resolution

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

1 code implementation • 4 Oct 2022 • Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments.

Language Modelling Large Language Model +1

Learning to Drop Out: An Adversarial Approach to Training Sequence VAEs

no code implementations • 26 Sep 2022 • Đorđe Miladinović, Kumar Shridhar, Kushal Jain, Max B. Paulus, Joachim M. Buhmann, Mrinmaya Sachan, Carl Allen

In principle, applying variational autoencoders (VAEs) to sequential data offers a method for controlled sequence generation, manipulation, and structured representation learning.

Representation Learning

Probing via Prompting

1 code implementation • NAACL 2022 • Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

We then examine the usefulness of a specific linguistic property for pre-training by removing the heads that are essential to that property and evaluating the resulting model's performance on language modeling.

Language Modelling

A Structured Span Selector

1 code implementation • NAACL 2022 • Tianyu Liu, Yuchen Eleanor Jiang, Ryan Cotterell, Mrinmaya Sachan

Many natural language processing tasks, e.g., coreference resolution and semantic role labeling, require selecting text spans and making decisions about them.

coreference-resolution Inductive Bias +1

Original or Translated? A Causal Analysis of the Impact of Translationese on Machine Translation Performance

1 code implementation • NAACL 2022 • Jingwei Ni, Zhijing Jin, Markus Freitag, Mrinmaya Sachan, Bernhard Schölkopf

We show that these two factors have a large causal effect on the MT performance, in addition to the test-model direction mismatch highlighted by existing work on the impact of translationese.

Machine Translation Translation

Calibration of Machine Reading Systems at Scale

no code implementations • Findings (ACL) 2022 • Shehzaad Dhuliawala, Leonard Adolphs, Rajarshi Das, Mrinmaya Sachan

We show that calibrating such complex systems which contain discrete retrieval and deep reading components is challenging and current calibration techniques fail to scale to these settings.

Claim Verification Open-Domain Question Answering +2

Slangvolution: A Causal Analysis of Semantic Change and Frequency Dynamics in Slang

1 code implementation • ACL 2022 • Daphna Keidar, Andreas Opedal, Zhijing Jin, Mrinmaya Sachan

We analyze the semantic change and frequency shift of slang words and compare them to those of standard, nonslang words.

Causal Discovery Causal Inference

Logical Fallacy Detection

2 code implementations • 28 Feb 2022 • Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate change claims (LogicClimate).

Language Modelling Logical Fallacies +2

What Has Been Enhanced in my Knowledge-Enhanced Language Model?

1 code implementation • 2 Feb 2022 • Yifan Hou, Guoji Fu, Mrinmaya Sachan

We conduct experiments to verify that our GCS can indeed be used to correctly interpret the KI process, and we use it to analyze two well-known knowledge-enhanced LMs: ERNIE and K-Adapter, and find that only a small amount of factual knowledge is integrated in them.

Graph Attention Language Modelling

A Hybrid Neuro-Symbolic approach for Text-Based Games using Inductive Logic Programming

no code implementations • AAAI Workshop CLeaR 2022 • Kinjal Basu, Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Kartik Talamadupula, Tim Klinger, Murray Campbell, Mrinmaya Sachan, Gopal Gupta

These rules are learned in an online manner and applied with an ASP solver to predict an action for the agent.

Inductive logic programming Natural Language Understanding +2

Case-based Reasoning for Better Generalization in Textual Reinforcement Learning

no code implementations • ICLR 2022 • Mattia Atzeni, Shehzaad Dhuliawala, Keerthiram Murugesan, Mrinmaya Sachan

Text-based games (TBG) have emerged as promising environments for driving research in grounded language understanding and studying problems like generalization and sample efficiency.

Out-of-Distribution Generalization reinforcement-learning +2

On Learning the Transformer Kernel

1 code implementation • 15 Oct 2021 • Sankalan Pal Chowdhury, Adamos Solomou, Avinava Dubey, Mrinmaya Sachan

In this work we introduce KERNELIZED TRANSFORMER, a generic, scalable, data driven framework for learning the kernel function in Transformers.

Computational Efficiency

Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

1 code implementation • EMNLP 2021 • Zhijing Jin, Julius von Kügelgen, Jingwei Ni, Tejas Vaidhya, Ayush Kaushal, Mrinmaya Sachan, Bernhard Schölkopf

The principle of independent causal mechanisms (ICM) states that generative processes of real world data consist of independent modules which do not influence or inform each other.

Causal Inference Domain Adaptation

"Let Your Characters Tell Their Story": A Dataset for Character-Centric Narrative Understanding

no code implementations • 12 Sep 2021 • Faeze Brahman, Meng Huang, Oyvind Tafjord, Chao Zhao, Mrinmaya Sachan, Snigdha Chaturvedi

When reading a literary piece, readers often make inferences about various characters' roles, personalities, relationships, intents, actions, etc.

Differentiable Subset Pruning of Transformer Heads

2 code implementations • 10 Aug 2021 • Jiaoda Li, Ryan Cotterell, Mrinmaya Sachan

Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer.

Machine Translation Natural Language Inference +1

Efficient Text-based Reinforcement Learning by Jointly Leveraging State and Commonsense Graph Representations

no code implementations • ACL 2021 • Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

Text-based games (TBGs) have emerged as useful benchmarks for evaluating progress at the intersection of grounded language understanding and reinforcement learning (RL).

Graph Attention Reinforcement Learning (RL) +1

Self-Supervised Contrastive Learning with Adversarial Perturbations for Defending Word Substitution-based Attacks

1 code implementation • Findings (NAACL) 2022 • Zhao Meng, Yihan Dong, Mrinmaya Sachan, Roger Wattenhofer

In this paper, we present an approach to improve the robustness of BERT language models against word substitution-based adversarial attacks by leveraging adversarial perturbations for self-supervised contrastive learning.

Adversarial Attack Contrastive Learning +1

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

2 code implementations • Findings (ACL) 2021 • Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan, Rada Mihalcea

We lay the foundations via the moral philosophy definition of social good, propose a framework to evaluate the direct and indirect real-world impact of NLP tasks, and adopt the methodology of global priorities research to identify priority causes for NLP research.

Philosophy

Bird's Eye: Probing for Linguistic Graph Structures with a Simple Information-Theoretic Approach

1 code implementation • ACL 2021 • Yifan Hou, Mrinmaya Sachan

However, due to the inter-dependence of various phenomena and randomness of training probe models, detecting how these representations encode the rich information in these linguistic graphs remains a challenging problem.

BlonDe: An Automatic Evaluation Metric for Document-level Machine Translation

2 code implementations • NAACL 2022 • Yuchen Eleanor Jiang, Tianyu Liu, Shuming Ma, Dongdong Zhang, Jian Yang, Haoyang Huang, Rico Sennrich, Ryan Cotterell, Mrinmaya Sachan, Ming Zhou

Standard automatic metrics, e.g., BLEU, are not reliable for document-level MT evaluation.

Document Level Machine Translation Machine Translation +2

Deep Clustering of Text Representations for Supervision-free Probing of Syntax

no code implementations • 24 Oct 2020 • Vikram Gupta, Haoyue Shi, Kevin Gimpel, Mrinmaya Sachan

We explore deep clustering of text representations for unsupervised model interpretation and induction of syntax.

Clustering Deep Clustering +1

Stronger Transformers for Neural Multi-Hop Question Generation

no code implementations • 22 Oct 2020 • Devendra Singh Sachan, Lingfei Wu, Mrinmaya Sachan, William Hamilton

In this work, we introduce a series of strong transformer models for multi-hop question generation, including a graph-augmented transformer that leverages relations between entities in the text.

Question Generation Question-Generation

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines

2 code implementations • 8 Oct 2020 • Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

Text-based games have emerged as an important test-bed for Reinforcement Learning (RL) research, requiring RL agents to combine grounded language understanding with sequential decision making.

Ranked #1 on Commonsense Reasoning for RL on commonsense-rl

Common Sense Reasoning Commonsense Reasoning for RL +3

Text-based RL Agents with Commonsense Knowledge: New Challenges, Environments and Approaches

no code implementations • 12 Jul 2020 • Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

We introduce a number of RL agents that combine the sequential context with a dynamic graph representation of their beliefs of the world and commonsense knowledge from ConceptNet in different ways.

Decision Making Reinforcement Learning (RL) +1

Knowledge Graph Embedding Compression

no code implementations • ACL 2020 • Mrinmaya Sachan

Knowledge graph (KG) representation learning techniques that learn continuous embeddings of entities and relations in the KG have become popular in many AI applications.

Knowledge Graph Embedding Representation Learning

Enhancing Text-based Reinforcement Learning Agents with Commonsense Knowledge

no code implementations • 2 May 2020 • Keerthiram Murugesan, Mattia Atzeni, Pushkar Shukla, Mrinmaya Sachan, Pavan Kapanipathi, Kartik Talamadupula

In this paper, we consider the recent trend of evaluating progress on reinforcement learning technology by using text-based environments and games as evaluation environments.

reinforcement-learning Reinforcement Learning (RL)

Discourse in Multimedia: A Case Study in Extracting Geometry Knowledge from Textbooks

no code implementations • CL 2019 • Mrinmaya Sachan, Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Learning Pipelines with Limited Data and Domain Knowledge: A Study in Parsing Physics Problems

no code implementations • NeurIPS 2018 • Mrinmaya Sachan, Kumar Avinava Dubey, Tom M. Mitchell, Dan Roth, Eric P. Xing

Finally, we also show how Nuts&Bolts can be used to achieve improvements on a relation extraction task and on the end task of answering Newtonian physics problems.

BIG-bench Machine Learning Relation Extraction

Discourse in Multimedia: A Case Study in Information Extraction

no code implementations • 13 Nov 2018 • Mrinmaya Sachan, Kumar Avinava Dubey, Eduard H. Hovy, Tom M. Mitchell, Dan Roth, Eric P. Xing

At the same time, these help the readers pick up the structure of the discourse and comprehend the conveyed information.

Contextual Parameter Generation for Universal Neural Machine Translation

1 code implementation • EMNLP 2018 • Emmanouil Antonios Platanios, Mrinmaya Sachan, Graham Neubig, Tom Mitchell

We propose a simple modification to existing neural machine translation (NMT) models that enables using a single universal model to translate between multiple languages while allowing for language specific parameterization, and that can also be used for domain adaptation.

Domain Adaptation Machine Translation +2

Self-Training for Jointly Learning to Ask and Answer Questions

no code implementations • NAACL 2018 • Mrinmaya Sachan, Eric Xing

The two tasks of question answering and question generation are usually tackled separately in the NLP literature.

Data Augmentation Question Answering +3

Effective Use of Bidirectional Language Modeling for Transfer Learning in Biomedical Named Entity Recognition

2 code implementations • 21 Nov 2017 • Devendra Singh Sachan, Pengtao Xie, Mrinmaya Sachan, Eric P. Xing

We also show that BiLM weight transfer leads to a faster model training and the pretrained model requires fewer training examples to achieve a particular F1 score.

Language Modelling named-entity-recognition +3