Search Results for author: Zhengxuan Wu

Found 30 papers, 25 papers with code

Structured Self-Attention Weights Encode Semantics in Sentiment Analysis

1 code implementation • EMNLP (BlackboxNLP) 2020 • Zhengxuan Wu, Thanh-Son Nguyen, Desmond C. Ong

Very recent work suggests that the self-attention in the Transformer encodes syntactic information. Here, we show that self-attention scores encode semantics by considering sentiment analysis tasks.

Sentiment Analysis • Time Series • +1
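
As a hedged illustration of the kind of analysis this line of work involves, the sketch below pulls self-attention scores from a Hugging Face BERT checkpoint on a sentiment example; the checkpoint choice and head-averaging are our assumptions, not the paper's exact setup.

```python
# Hedged sketch: inspect self-attention scores on a sentiment example.
# The checkpoint and head-averaging are illustrative, not the paper's setup.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The movie was surprisingly wonderful.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs, output_attentions=True)

# outputs.attentions: one (batch, heads, seq, seq) tensor per layer.
last = outputs.attentions[-1][0].mean(dim=0)     # average heads -> (seq, seq)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for tok, score in zip(tokens, last[0]):          # attention from [CLS]
    print(f"{tok:>12}  {score:.3f}")
```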

ReFT: Representation Finetuning for Language Models

2 code implementations • 4 Apr 2024 • Zhengxuan Wu, Aryaman Arora, Zheng Wang, Atticus Geiger, Dan Jurafsky, Christopher D. Manning, Christopher Potts

LoReFT is a drop-in replacement for existing PEFTs and learns interventions that are 10x-50x more parameter-efficient than prior state-of-the-art PEFTs.

Arithmetic Reasoning
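
A minimal sketch of the LoReFT intervention, assuming the paper's update rule Phi(h) = h + R^T(Wh + b - Rh), where R is a low-rank matrix with orthonormal rows; the authors' official implementation is the pyreft library, and the dimensions below are illustrative.

```python
# Hedged sketch of LoReFT: Phi(h) = h + R^T (W h + b - R h), following the
# formula in the paper. Training plumbing is omitted; dims are illustrative.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

class LoReFTIntervention(nn.Module):
    def __init__(self, hidden_dim: int, rank: int):
        super().__init__()
        # R: rank x hidden projection constrained to have orthonormal rows.
        self.R = orthogonal(nn.Linear(hidden_dim, rank, bias=False))
        self.W = nn.Linear(hidden_dim, rank)   # learned edit source W h + b

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # Move h only within the rank-dimensional subspace spanned by R.
        return h + (self.W(h) - self.R(h)) @ self.R.weight

h = torch.randn(2, 10, 768)                      # (batch, seq, hidden)
print(LoReFTIntervention(768, rank=4)(h).shape)  # torch.Size([2, 10, 768])
```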

Mapping the Increasing Use of LLMs in Scientific Papers

no code implementations • 1 Apr 2024 • Weixin Liang, Yaohui Zhang, Zhengxuan Wu, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D. Manning, James Y. Zou

To address this gap, we conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on the arXiv, bioRxiv, and Nature portfolio journals, using a population-level statistical framework to measure the prevalence of LLM-modified content over time.
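
A toy sketch of the population-level idea: if word-occurrence probabilities under human-written and LLM-modified text are known, the LLM-modified fraction can be estimated by maximum likelihood on a two-component mixture. All numbers below are made up for illustration; the paper's estimator is more elaborate.

```python
# Toy mixture estimate of the fraction alpha of LLM-modified documents.
# Probabilities and counts here are invented; only the mixture idea is real.
import numpy as np
from scipy.optimize import minimize_scalar

p_human = np.array([0.010, 0.002, 0.008])   # P(word appears | human), toy
p_llm   = np.array([0.030, 0.015, 0.009])   # P(word appears | LLM), toy
counts  = np.array([180, 75, 85])           # docs containing each word
n_docs  = 10_000

def neg_log_lik(alpha: float) -> float:
    # Binomial log-likelihood of the observed counts under the mixture.
    p_mix = (1 - alpha) * p_human + alpha * p_llm
    return -np.sum(counts * np.log(p_mix)
                   + (n_docs - counts) * np.log(1 - p_mix))

alpha_hat = minimize_scalar(neg_log_lik, bounds=(0.0, 1.0),
                            method="bounded").x
print(f"estimated LLM-modified fraction: {alpha_hat:.3f}")
```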

pyvene: A Library for Understanding and Improving PyTorch Models via Interventions

3 code implementations • 12 Mar 2024 • Zhengxuan Wu, Atticus Geiger, Aryaman Arora, Jing Huang, Zheng Wang, Noah D. Goodman, Christopher D. Manning, Christopher Potts

Interventions on model-internal states are fundamental operations in many areas of AI, including model editing, steering, robustness, and interpretability.

Model Editing
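
pyvene wraps such interventions in a declarative configuration API; as a library-agnostic illustration of what an intervention on model-internal state is, here is a sketch using a plain PyTorch forward hook on GPT-2 (the layer index and neuron slice are arbitrary choices of ours, not pyvene's API).

```python
# Library-agnostic sketch: an intervention on model-internal state via a
# plain PyTorch forward hook. pyvene expresses this pattern declaratively;
# the layer index and neuron slice here are arbitrary illustrative choices.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

def zero_neurons(module, inputs, output):
    output[..., :64] = 0.0        # zero the first 64 MLP output dimensions
    return output

handle = model.transformer.h[5].mlp.register_forward_hook(zero_neurons)
ids = tok("Interventions on model internals", return_tensors="pt")
with torch.no_grad():
    logits = model(**ids).logits  # forward pass runs with the intervention
handle.remove()                   # restore the unmodified model
```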

In-Context Sharpness as Alerts: An Inner Representation Perspective for Hallucination Mitigation

1 code implementation • 3 Mar 2024 • Shiqi Chen, Miao Xiong, Junteng Liu, Zhengxuan Wu, Teng Xiao, Siyang Gao, Junxian He

Large language models (LLMs) frequently hallucinate and produce factual errors, yet our understanding of why they make these errors remains limited.

Hallucination

A Reply to Makelov et al. (2023)'s "Interpretability Illusion" Arguments

1 code implementation • 23 Jan 2024 • Zhengxuan Wu, Atticus Geiger, Jing Huang, Aryaman Arora, Thomas Icard, Christopher Potts, Noah D. Goodman

We respond to the recent paper by Makelov et al. (2023), which reviews subspace interchange intervention methods like distributed alignment search (DAS; Geiger et al. 2023) and claims that these methods potentially cause "interpretability illusions".

Rigorously Assessing Natural Language Explanations of Neurons

no code implementations • 19 Sep 2023 • Jing Huang, Atticus Geiger, Karel D'Oosterlinck, Zhengxuan Wu, Christopher Potts

Natural language is an appealing medium for explaining how large language models process and store information, but evaluating the faithfulness of such explanations is challenging.

MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions

2 code implementations • 24 May 2023 • Zexuan Zhong, Zhengxuan Wu, Christopher D. Manning, Christopher Potts, Danqi Chen

The information stored in large language models (LLMs) falls out of date quickly, and retraining from scratch is often not an option.

Knowledge Editing • Language Modelling • +2

Interpretability at Scale: Identifying Causal Mechanisms in Alpaca

1 code implementation • NeurIPS 2023 • Zhengxuan Wu, Atticus Geiger, Thomas Icard, Christopher Potts, Noah D. Goodman

With Boundless DAS, we discover that Alpaca solves a simple numerical reasoning problem by implementing a causal model with two interpretable boolean variables.

Finding Alignments Between Interpretable Causal Variables and Distributed Neural Representations

no code implementations • 5 Mar 2023 • Atticus Geiger, Zhengxuan Wu, Christopher Potts, Thomas Icard, Noah D. Goodman

In DAS, we find the alignment between high-level and low-level models using gradient descent rather than conducting a brute-force search, and we allow individual neurons to play multiple distinct roles by analyzing representations in non-standard bases (distributed representations).

Explainable artificial intelligence
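
A sketch of the core DAS operation under our reading of the paper: an orthogonal matrix learned by gradient descent defines a basis, and an interchange intervention swaps a subspace of that basis between the representations of a base and a source input. Dimensions are illustrative and the training loop is omitted.

```python
# Hedged sketch of the core DAS operation: a learned orthogonal matrix
# defines a basis, and an interchange intervention swaps the first k
# rotated coordinates between a base and a source representation.
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import orthogonal

d, k = 768, 16                                  # hidden size, subspace size
rotation = orthogonal(nn.Linear(d, d, bias=False))

def interchange(h_base: torch.Tensor, h_source: torch.Tensor) -> torch.Tensor:
    R = rotation.weight                         # orthogonal (d, d)
    rb, rs = h_base @ R.T, h_source @ R.T       # rotate into the learned basis
    rb[..., :k] = rs[..., :k]                   # swap the aligned subspace
    return rb @ R                               # rotate back to neuron space

h_new = interchange(torch.randn(1, d), torch.randn(1, d))
# Training optimizes R (by gradient descent) so that the intervened network
# matches the counterfactual behavior of the hypothesized causal model.
```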

Inducing Character-level Structure in Subword-based Language Models with Type-level Interchange Intervention Training

1 code implementation • 19 Dec 2022 • Jing Huang, Zhengxuan Wu, Kyle Mahowald, Christopher Potts

Language tasks involving character-level manipulations (e.g., spelling corrections, arithmetic operations, word games) are challenging for models operating on subword units.

Spelling Correction

ZeroC: A Neuro-Symbolic Model for Zero-shot Concept Recognition and Acquisition at Inference Time

1 code implementation • 30 Jun 2022 • Tailin Wu, Megan Tjandrasuwita, Zhengxuan Wu, Xuelin Yang, Kevin Liu, Rok Sosič, Jure Leskovec

In this work, we introduce Zero-shot Concept Recognition and Acquisition (ZeroC), a neuro-symbolic architecture that can recognize and acquire novel concepts in a zero-shot way.

Novel Concepts

Oolong: Investigating What Makes Transfer Learning Hard with Controlled Studies

1 code implementation • 24 Feb 2022 • Zhengxuan Wu, Alex Tamkin, Isabel Papadimitriou

When we transfer a pretrained language model to a new language, there are many axes of variation that change at once.

Cross-Lingual Transfer • Language Modelling • +1

Inducing Causal Structure for Interpretable Neural Networks

2 code implementations • 1 Dec 2021 • Atticus Geiger, Zhengxuan Wu, Hanson Lu, Josh Rozner, Elisa Kreiss, Thomas Icard, Noah D. Goodman, Christopher Potts

In IIT, we (1) align variables in a causal model (e.g., a deterministic program or Bayesian network) with representations in a neural model and (2) train the neural model to match the counterfactual behavior of the causal model on a base input when aligned representations in both models are set to the values they would have for a source input.

Counterfactual • Data Augmentation • +1
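
A schematic of one IIT training step. The helpers run_with_interchange, get_activation, and run_with_patch are placeholders for model-specific plumbing an actual implementation would supply; only the two-step structure (align, then match counterfactuals) follows the paper.

```python
# Schematic IIT step. run_with_interchange, get_activation, and
# run_with_patch are placeholder helpers, not a real API; only the
# two-step structure follows the paper.
import torch.nn.functional as F

def iit_step(neural_model, causal_model, base, source, optimizer):
    # (1) Counterfactual target: run the causal model on `base` with the
    # aligned high-level variable set to its value on `source`.
    target = causal_model.run_with_interchange(base, source, variable="V")

    # (2) Neural interchange: patch the aligned activation from the
    # `source` forward pass into the `base` forward pass.
    src_act = neural_model.get_activation(source, layer=6)
    logits = neural_model.run_with_patch(base, layer=6, value=src_act)

    # Train the network to reproduce the causal model's counterfactual.
    loss = F.cross_entropy(logits, target)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return loss.item()
```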

ReaSCAN: Compositional Reasoning in Language Grounding

3 code implementations • 18 Sep 2021 • Zhengxuan Wu, Elisa Kreiss, Desmond C. Ong, Christopher Potts

The ability to compositionally map language to referents, relations, and actions is an essential component of language understanding.

Identifying the Limits of Cross-Domain Knowledge Transfer for Pretrained Models

1 code implementation • RepL4NLP (ACL) 2022 • Zhengxuan Wu, Nelson F. Liu, Christopher Potts

There is growing evidence that pretrained language models improve task-specific fine-tuning not just for the languages seen in pretraining, but also for new languages and even non-linguistic data.

Transfer Learning

On Explaining Your Explanations of BERT: An Empirical Study with Sequence Classification

2 code implementations • 1 Jan 2021 • Zhengxuan Wu, Desmond C. Ong

In this paper, we adapt existing attribution methods to explain the decision-making of BERT in sequence classification tasks.

General Classification • Sentiment Analysis
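
As one concrete member of the attribution-method family the paper studies, here is a gradient-times-input sketch for a BERT sequence classifier; the SST-2 checkpoint named below is a public community model used purely for illustration.

```python
# Sketch of one attribution method (gradient x input) for a BERT sequence
# classifier. The SST-2 checkpoint is a public community model, used here
# purely for illustration.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "textattack/bert-base-uncased-SST-2"
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

inputs = tok("A thoroughly enjoyable film.", return_tensors="pt")
embeds = model.bert.embeddings.word_embeddings(inputs["input_ids"])
embeds.retain_grad()
logits = model(inputs_embeds=embeds,
               attention_mask=inputs["attention_mask"]).logits
logits[0, logits[0].argmax()].backward()        # gradient of the top logit

scores = (embeds.grad * embeds).sum(-1)[0]      # per-token attribution
for t, s in zip(tok.convert_ids_to_tokens(inputs["input_ids"][0]), scores):
    print(f"{t:>12}  {s:+.3f}")
```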

DynaSent: A Dynamic Benchmark for Sentiment Analysis

1 code implementation • ACL 2021 • Christopher Potts, Zhengxuan Wu, Atticus Geiger, Douwe Kiela

We introduce DynaSent ('Dynamic Sentiment'), a new English-language benchmark task for ternary (positive/negative/neutral) sentiment analysis.

Sentiment Analysis

Modeling emotion in complex stories: the Stanford Emotional Narratives Dataset

2 code implementations • 22 Nov 2019 • Desmond C. Ong, Zhengxuan Wu, Tan Zhi-Xuan, Marianne Reddan, Isabella Kahhale, Alison Mattek, Jamil Zaki

We begin by assessing the state-of-the-art in time-series emotion recognition, and we review contemporary time-series approaches in affective computing, including discriminative and generative models.

Emotion Recognition • Time Series • +1

Disentangling Latent Emotions of Word Embeddings on Complex Emotional Narratives

no code implementations • 15 Aug 2019 • Zhengxuan Wu, Yueyi Jiang

We showed that, in the proposed emotion space, we were able to disentangle emotions better than with raw GloVe vectors alone.

Word Embeddings

Attending to Emotional Narratives

1 code implementation • 8 Jul 2019 • Zhengxuan Wu, Xiyu Zhang, Tan Zhi-Xuan, Jamil Zaki, Desmond C. Ong

Attention mechanisms in deep neural networks have achieved excellent performance on sequence-prediction tasks.

Emotion Recognition • Time Series • +1
