Search Results for author: Niranjan Balasubramanian

Found 54 papers, 28 papers with code

IrEne-viz: Visualizing Energy Consumption of Transformer Models

1 code implementation • EMNLP (ACL) 2021 • Yash Kumar Lal, Reetu Singh, Harsh Trivedi, Qingqing Cao, Aruna Balasubramanian, Niranjan Balasubramanian

IrEne is an energy prediction system that accurately predicts the interpretable inference energy consumption of a wide range of Transformer-based NLP models.

303

Paper
Code

SpecNFS: A Challenge Dataset Towards Extracting Formal Models from Natural Language Specifications

1 code implementation • LREC 2022 • Sayontan Ghosh, Amanpreet Singh, Alex Merenstein, Wei Su, Scott A. Smolka, Erez Zadok, Niranjan Balasubramanian

Evaluations show that even when using a state-of-the-art language model, there is significant room for improvement, with the best models achieving an F1 score of only 60. 5 and 33. 3 in the named-entity-recognition and dependency-link-prediction sub-tasks, respectively.

Dependency Parsing Domain Adaptation +7

Paper
Code

Comparing Pre-trained Human Language Models: Is it Better with Human Context as Groups, Individual Traits, or Both?

no code implementations • 23 Jan 2024 • Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz, Dirk Hovy

We compare pre-training models with human context via 1) group attributes, 2) individual users, and 3) a combined approach on 5 user- and document-level tasks.

Age Estimation Language Modelling

Paper
Add Code

Large Human Language Models: A Need and the Challenges

no code implementations • 9 Nov 2023 • Nikita Soni, H. Andrew Schwartz, João Sedoc, Niranjan Balasubramanian

As research in human-centered NLP advances, there is a growing recognition of the importance of incorporating human and social factors into NLP models.

Paper
Add Code

Modeling Complex Event Scenarios via Simple Entity-focused Questions

1 code implementation • 14 Feb 2023 • Mahnaz Koupaee, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian

Event scenarios are often complex and involve multiple event sequences connected through different entity participants.

Language Modelling

Paper
Code

Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions

1 code implementation • 20 Dec 2022 • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

While using the question to retrieve relevant text from an external knowledge source helps LLMs, we observe that this one-step retrieve-and-read approach is insufficient for multi-step QA.

Hallucination Question Answering +1

Paper
Code

POQue: Asking Participant-specific Outcome Questions for a Deeper Understanding of Complex Events

1 code implementation • 5 Dec 2022 • Sai Vallurupalli, Sayontan Ghosh, Katrin Erk, Niranjan Balasubramanian, Francis Ferraro

Knowledge about outcomes is critical for complex event understanding but is hard to acquire.

Paper
Code

BioNLI: Generating a Biomedical NLI Dataset Using Lexico-semantic Constraints for Adversarial Examples

1 code implementation • 26 Oct 2022 • Mohaddeseh Bastan, Mihai Surdeanu, Niranjan Balasubramanian

We introduce a novel semi-supervised procedure that bootstraps an NLI dataset from existing biomedical dataset that pairs mechanisms with experimental evidence in abstracts.

Ranked #1 on Natural Language Inference on BioNLI

Decision Making Natural Language Inference

Paper
Code

Text-Derived Knowledge Helps Vision: A Simple Cross-modal Distillation for Video-based Action Anticipation

1 code implementation • 12 Oct 2022 • Sayontan Ghosh, Tanvi Aggarwal, Minh Hoai, Niranjan Balasubramanian

Anticipating future actions in a video is useful for many autonomous and assistive technologies.

Action Anticipation Transfer Learning

Paper
Code

Efficient Methods for Natural Language Processing: A Survey

no code implementations • 31 Aug 2022 • Marcos Treviso, Ji-Ung Lee, Tianchu Ji, Betty van Aken, Qingqing Cao, Manuel R. Ciosici, Michael Hassid, Kenneth Heafield, Sara Hooker, Colin Raffel, Pedro H. Martins, André F. T. Martins, Jessica Zosa Forde, Peter Milder, Edwin Simpson, Noam Slonim, Jesse Dodge, Emma Strubell, Niranjan Balasubramanian, Leon Derczynski, Iryna Gurevych, Roy Schwartz

Recent work in natural language processing (NLP) has yielded appealing results from scaling model parameters and training data; however, using only scale to improve performance means that resource consumption also grows.

Information Retrieval Open-Domain Question Answering

Paper
Add Code

PASTA: A Dataset for Modeling Participant States in Narratives

no code implementations • 31 Jul 2022 • Sayontan Ghosh, Mahnaz Koupaee, Isabella Chen, Francis Ferraro, Nathanael Chambers, Niranjan Balasubramanian

This dataset contains inferable participant states; a counterfactual perturbation to each state; and the changes to the story that would be necessary if the counterfactual were true.

Benchmarking Common Sense Reasoning +1

Paper
Add Code

Teaching Broad Reasoning Skills for Multi-Step QA by Generating Hard Contexts

1 code implementation • 25 May 2022 • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

We show how to use question decompositions to teach language models these broad reasoning skills in a robust fashion.

Question Answering

Paper
Code

Enhancing Continual Learning with Global Prototypes: Counteracting Negative Representation Drift

no code implementations • 24 May 2022 • Xueying Bai, Jinghuan Shang, Yifan Sun, Niranjan Balasubramanian

Continual learning (CL) aims to learn a sequence of tasks over time, with data distributions shifting from one task to another.

Continual Learning Language Modelling +1

Paper
Add Code

SuMe: A Dataset Towards Summarizing Biomedical Mechanisms

2 code implementations • ACL ARR November 2021 • Mohaddeseh Bastan, Nishant Shankar, Mihai Surdeanu, Niranjan Balasubramanian

We leverage this structure and create a summarization task, where the input is a collection of sentences and the main entities in an abstract, and the output includes the relationship and a sentence that summarizes the mechanism.

Sentence

Paper
Code

Human Language Modeling

1 code implementation • Findings (ACL) 2022 • Nikita Soni, Matthew Matero, Niranjan Balasubramanian, H. Andrew Schwartz

Natural language is generated by people, yet traditional language modeling views words or documents as if generated independently.

Age Estimation Language Modelling +3

Paper
Code

MeLT: Message-Level Transformer with Masked Document Representations as Pre-Training for Stance Detection

1 code implementation • Findings (EMNLP) 2021 • Matthew Matero, Nikita Soni, Niranjan Balasubramanian, H. Andrew Schwartz

Much of natural language processing is focused on leveraging large capacity language models, typically trained over single messages with a task of predicting one or more tokens.

Attribute Language Modelling +2

Paper
Code

Summarize-then-Answer: Generating Concise Explanations for Multi-hop Reading Comprehension

1 code implementation • EMNLP 2021 • Naoya Inoue, Harsh Trivedi, Steven Sinha, Niranjan Balasubramanian, Kentaro Inui

Instead, we advocate for an abstractive approach, where we propose to generate a question-focused, abstractive summary of input paragraphs and then feed it to an RC system.

2k Multi-Hop Reading Comprehension

Paper
Code

MuSiQue: Multihop Questions via Single-hop Question Composition

1 code implementation • 2 Aug 2021 • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

Multihop reasoning remains an elusive goal as existing multihop benchmarks are known to be largely solvable via shortcuts.

Multi-hop Question Answering Question Answering

Paper
Code

Don't Let Discourse Confine Your Model: Sequence Perturbations for Improved Event Language Models

no code implementations • ACL 2021 • Mahnaz Koupaee, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian

Event language models represent plausible sequences of events.

Paper
Add Code

Toward Diverse Precondition Generation

no code implementations • Joint Conference on Lexical and Computational Semantics 2021 • Heeyoung Kwon, Nathanael Chambers, Niranjan Balasubramanian

We propose DiP, a Diverse Precondition generation system that can generate unique and diverse preconditions.

Paper
Add Code

TellMeWhy: A Dataset for Answering Why-Questions in Narratives

1 code implementation • Findings (ACL) 2021 • Yash Kumar Lal, Nathanael Chambers, Raymond Mooney, Niranjan Balasubramanian

They are especially worse on questions whose answers are external to the narrative, thus providing a challenge for future QA and narrative understanding research.

Paper
Code

On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers

no code implementations • Findings (ACL) 2021 • Tianchu Ji, Shraddhan Jain, Michael Ferdman, Peter Milder, H. Andrew Schwartz, Niranjan Balasubramanian

This informs the design of an inference-time quantization technique using both pruning and log-scaled mapping which produces only a few (e. g. $2^3$) unique values.

Quantization Question Answering +1

Paper
Add Code

IrEne: Interpretable Energy Prediction for Transformers

1 code implementation • ACL 2021 • Qingqing Cao, Yash Kumar Lal, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian

We present IrEne, an interpretable and extensible energy prediction system that accurately predicts the inference energy consumption of a wide range of Transformer-based NLP models.

Paper
Code

Bew: Towards Answering Business-Entity-Related Web Questions

no code implementations • 10 Dec 2020 • Qingqing Cao, Oriana Riva, Aruna Balasubramanian, Niranjan Balasubramanian

We present a practical approach, called BewQA, that can answer Bew queries by mining a template of the business-related webpages and using the template to guide the search.

Paper
Add Code

Open4Business(O4B): An Open Access Dataset for Summarizing Business Documents

1 code implementation • 15 Nov 2020 • Amanpreet Singh, Niranjan Balasubramanian

The dataset introduces a new challenge for summarization in the business domain, requiring highly abstractive and more concise summaries as compared to other existing datasets.

Paper
Code

Author's Sentiment Prediction

1 code implementation • COLING 2020 • Mohaddeseh Bastan, Mahnaz Koupaee, Youngseo Son, Richard Sicoli, Niranjan Balasubramanian

We introduce PerSenT, a dataset of crowd-sourced annotations of the sentiment expressed by the authors towards the main entities in news articles.

Sentiment Analysis

Paper
Code

LANNS: A Web-Scale Approximate Nearest Neighbor Lookup System

no code implementations • 19 Oct 2020 • Ishita Doshi, Dhritiman Das, Ashish Bhutani, Rajeev Kumar, Rushi Bhatt, Niranjan Balasubramanian

Nearest neighbor search (NNS) has a wide range of applications in information retrieval, computer vision, machine learning, databases, and other areas.

Information Retrieval Playing the Game of 2048 +1

Paper
Add Code

Towards Accurate and Reliable Energy Measurement of NLP Models

1 code implementation • EMNLP (sustainlp) 2020 • Qingqing Cao, Aruna Balasubramanian, Niranjan Balasubramanian

In this work, we show that existing software-based energy measurements are not accurate because they do not take into account hardware differences and how resource utilization affects energy consumption.

Question Answering

Paper
Code

Modeling Preconditions in Text with a Crowd-sourced Dataset

1 code implementation • Findings of the Association for Computational Linguistics 2020 • Heeyoung Kwon, Mahnaz Koupaee, Pratyush Singh, Gargi Sawhney, Anmol Shukla, Keerthi Kumar Kallur, Nathanael Chambers, Niranjan Balasubramanian

This paper introduces PeKo, a crowd-sourced annotation of preconditions between event pairs in newswire, an order of magnitude larger than prior text annotations.

Paper
Code

Hierarchical Modeling for User Personality Prediction: The Role of Message-Level Attention

no code implementations • ACL 2020 • Veronica Lynn, Niranjan Balasubramanian, H. Andrew Schwartz

Not all documents are equally important.

Paper
Add Code

Modeling Label Semantics for Predicting Emotional Reactions

1 code implementation • ACL 2020 • Radhika Gaonkar, Heeyoung Kwon, Mohaddeseh Bastan, Niranjan Balasubramanian, Nathanael Chambers

Predicting how events induce emotions in the characters of a story is typically seen as a standard multi-label classification task, which usually treats labels as anonymous classes to predict.

Ranked #1 on Emotion Classification on ROCStories

Emotion Classification Multi-Label Classification

Paper
Code

Is Multihop QA in DiRe Condition? Measuring and Reducing Disconnected Reasoning

1 code implementation • EMNLP 2020 • Harsh Trivedi, Niranjan Balasubramanian, Tushar Khot, Ashish Sabharwal

For a recent large-scale model (XLNet), we show that only 18 points out of its answer F1 score of 72 on HotpotQA are obtained through multifact reasoning, roughly the same as that of a simpler RNN baseline.

Multi-hop Question Answering Question Answering +1

Paper
Code

DeFormer: Decomposing Pre-trained Transformers for Faster Question Answering

1 code implementation • ACL 2020 • Qingqing Cao, Harsh Trivedi, Aruna Balasubramanian, Niranjan Balasubramanian

It turns out that we can get by without input-wide self-attention at all layers, especially in the lower layers.

Question Answering

120

Paper
Code

Generating Narrative Text in a Switching Dynamical System

1 code implementation • CONLL 2020 • Noah Weber, Leena Shekhar, Heeyoung Kwon, Niranjan Balasubramanian, Nathanael Chambers

A SLDS is a dynamical system in which the latent dynamics of the system (i. e. how the state vector transforms over time) is controlled by top-level discrete switching variables.

Text Generation

Paper
Code

Adaptive Activation Network and Functional Regularization for Efficient and Flexible Deep Multi-Task Learning

no code implementations • 19 Nov 2019 • Yingru Liu, Xuewen Yang, Dongliang Xie, Xin Wang, Li Shen, Hao-Zhi Huang, Niranjan Balasubramanian

In this paper, we propose a novel deep learning model called Task Adaptive Activation Network (TAAN) that can automatically learn the optimal network architecture for MTL.

Multi-Task Learning

Paper
Add Code

Latent Part-of-Speech Sequences for Neural Machine Translation

no code implementations • IJCNLP 2019 • Xuewen Yang, Yingru Liu, Dongliang Xie, Xin Wang, Niranjan Balasubramanian

In this work, we introduce a new latent variable model, LaSyn, that captures the co-dependence between syntax and semantics, while allowing for effective and efficient inference over the latent space.

Machine Translation NMT +1

Paper
Add Code

Tweet Classification without the Tweet: An Empirical Examination of User versus Document Attributes

no code implementations • WS 2019 • Veronica Lynn, Salvatore Giorgi, Niranjan Balasubramanian, H. Andrew Schwartz

NLP naturally puts a primary focus on leveraging document language, occasionally considering user attributes as supplemental.

General Classification Stance Detection

Paper
Add Code

Repurposing Entailment for Multi-Hop Question Answering Tasks

4 code implementations • NAACL 2019 • Harsh Trivedi, Heeyoung Kwon, Tushar Khot, Ashish Sabharwal, Niranjan Balasubramanian

We introduce Multee, a general architecture that can effectively use entailment models for multi-hop QA tasks.

Multi-hop Question Answering Question Answering +1

Paper
Code

PoMo: Generating Entity-Specific Post-Modifiers in Context

no code implementations • NAACL 2019 • Jun Seok Kang, Robert L. Logan IV, Zewei Chu, Yang Chen, Dheeru Dua, Kevin Gimpel, Sameer Singh, Niranjan Balasubramanian

Given a sentence about a target entity, the task is to automatically generate a post-modifier phrase that provides contextually relevant information about the entity.

Sentence

Paper
Add Code

Residualized Factor Adaptation for Community Social Media Prediction Tasks

no code implementations • EMNLP 2018 • Mohammadzaman Zamani, H. Andrew Schwartz, Veronica E. Lynn, Salvatore Giorgi, Niranjan Balasubramanian

Predictive models over social media language have shown promise in capturing community outcomes, but approaches thus far largely neglect the socio-demographic context (e. g. age, education rates, race) of the community from which the language originates.

Paper
Add Code

Hierarchical Quantized Representations for Script Generation

1 code implementation • EMNLP 2018 • Noah Weber, Leena Shekhar, Niranjan Balasubramanian, Nathanael Chambers

This permits the decoder to softly decide what portions of the latent hierarchy to condition on by attending over the value embeddings for a given setting.

Language Modelling Quantization

Paper
Code

Fake Sentence Detection as a Training Task for Sentence Encoding

no code implementations • ICLR 2019 • Viresh Ranjan, Heeyoung Kwon, Niranjan Balasubramanian, Minh Hoai

We automatically generate fake sentences by corrupting original sentences from a source collection and train the encoders to produce representations that are effective at detecting fake sentences.

Binary Classification Language Modelling +1

Paper
Add Code

The Fine Line between Linguistic Generalization and Failure in Seq2Seq-Attention Models

2 code implementations • WS 2018 • Noah Weber, Leena Shekhar, Niranjan Balasubramanian

Seq2Seq based neural architectures have become the go-to architecture to apply to sequence to sequence language tasks.

Paper
Code

Controlling Decoding for More Abstractive Summaries with Copy-Based Networks

no code implementations • 19 Mar 2018 • Noah Weber, Leena Shekhar, Niranjan Balasubramanian, Kyunghyun Cho

Attention-based neural abstractive summarization systems equipped with copy mechanisms have shown promising results.

Abstractive Text Summarization Extractive Summarization

Paper
Add Code

Event Representations with Tensor-based Compositions

1 code implementation • 21 Nov 2017 • Noah Weber, Niranjan Balasubramanian, Nathanael Chambers

Robust and flexible event representations are important to many core areas in language understanding.

Paper
Code

Human Centered NLP with User-Factor Adaptation

no code implementations • EMNLP 2017 • Veronica Lynn, Youngseo Son, Vivek Kulkarni, Niranjan Balasubramanian, H. Andrew Schwartz

We pose the general task of user-factor adaptation {--} adapting supervised learning models to real-valued user factors inferred from a background of their language, reflecting the idea that a piece of text should be understood within the context of the user that wrote it.

Document Classification Domain Adaptation +5