Search Results for author: Xuhui Zhou

Found 18 papers, 8 papers with code

Is this the real life? Is this just fantasy? The Misleading Success of Simulating Social Interactions With LLMs

no code implementations • 8 Mar 2024 • Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap

Recent advances in large language models (LLMs) have enabled richer social simulations, allowing for the study of various social phenomena with LLM-based agents.

FANToM: A Benchmark for Stress-testing Machine Theory of Mind in Interactions

no code implementations • 24 Oct 2023 • Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap

Theory of mind (ToM) evaluations currently focus on testing models using passive narratives that inherently lack interactivity.

Tasks: Question Answering

SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents

1 code implementation • 18 Oct 2023 • Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap

We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence.
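For intuition, an episode of this kind reduces to two LLM-backed agents with private social goals taking turns in a shared scenario. The sketch below is illustrative only; query_llm and the prompt format are hypothetical stand-ins, not the SOTOPIA API.

    # Hypothetical sketch of a SOTOPIA-style episode: two LLM-backed agents
    # with private social goals take turns in a shared scenario. `query_llm`
    # stands in for any chat-completion call; it is not part of SOTOPIA.

    def query_llm(prompt: str) -> str:
        raise NotImplementedError("plug in your LLM client here")

    def run_episode(scenario: str, goals: dict[str, str],
                    max_turns: int = 10) -> list[tuple[str, str]]:
        history: list[tuple[str, str]] = []   # (speaker, utterance) pairs
        agents = list(goals)                  # e.g. ["agent_a", "agent_b"]
        for turn in range(max_turns):
            speaker = agents[turn % len(agents)]
            transcript = "\n".join(f"{s}: {u}" for s, u in history)
            prompt = (
                f"Scenario: {scenario}\n"
                f"Your private goal: {goals[speaker]}\n"
                f"Conversation so far:\n{transcript}\n"
                f"Reply as {speaker}:"
            )
            history.append((speaker, query_llm(prompt)))
        return history

Evaluation then happens on the finished transcript, e.g. by asking a judge model how well each agent achieved its private goal.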

WebArena: A Realistic Web Environment for Building Autonomous Agents

1 code implementation • 25 Jul 2023 • Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig

Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.
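The key idea behind functional correctness is to judge the final environment state rather than the agent's action sequence. A minimal illustration, with hypothetical run_agent and predicate names (not WebArena's actual harness):

    # Illustrative sketch (not WebArena's real harness): functional
    # correctness checks the final environment state, not how the agent
    # got there.

    from typing import Callable

    def evaluate_task(run_agent: Callable[[str], dict], intent: str,
                      check: Callable[[dict], bool]) -> bool:
        """`run_agent` executes the task and returns the final state,
        e.g. the contents of a shopping cart; `check` is a task-specific
        predicate over that state."""
        final_state = run_agent(intent)
        return check(final_state)

    # Example predicate: the order total must match, regardless of which
    # sequence of clicks the agent used to get there.
    is_correct = lambda state: state.get("cart_total") == 29.99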

COBRA Frames: Contextual Reasoning about Effects and Harms of Offensive Statements

no code implementations • 3 Jun 2023 • Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap

To study the contextual dynamics of offensiveness, we train models to generate COBRA explanations, with and without access to the context.

Don't Take This Out of Context! On the Need for Contextual Models and Evaluations for Stylistic Rewriting

no code implementations • 24 May 2023 • Akhila Yerukola, Xuhui Zhou, Elizabeth Clark, Maarten Sap

Most existing stylistic text rewriting methods and evaluation metrics operate at the sentence level, but ignoring the broader context of the text can lead to preferring generic, ambiguous, and incoherent rewrites.

Tasks: Sentence

Learning to translate by learning to communicate

1 code implementation • 14 Jul 2022 • C. M. Downey, Xuhui Zhou, Leo Z. Liu, Shane Steinert-Threlkeld

We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages.

Tasks: Natural Language Understanding, NMT
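As background, the standard EC setup is a referential game: a sender emits a message about a target, and a receiver must pick that target out of a candidate set. A minimal PyTorch sketch of one training step follows; the paper's exact game and its coupling to the pre-trained multilingual model are not reproduced here.

    # Minimal referential-game step, the standard EC setup. Both modules
    # are trained end-to-end on the receiver's classification loss.

    import torch
    import torch.nn.functional as F

    def ec_step(sender, receiver, target, distractors):
        """Sender emits a message for `target`; receiver must identify the
        target among candidates. `sender`/`receiver` are nn.Modules."""
        message = sender(target)              # differentiable message, e.g. Gumbel-softmax
        candidates = torch.stack([target] + distractors)  # index 0 is the target
        scores = receiver(message, candidates)            # one score per candidate
        loss = F.cross_entropy(scores.unsqueeze(0),
                               torch.zeros(1, dtype=torch.long))
        return loss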

Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection

no code implementations • NAACL 2022 • Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith

The perceived toxicity of language can vary based on someone's identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases.

Extracting and Inferring Personal Attributes from Dialogue

1 code implementation • NLP4ConvAI (ACL) 2022 • Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia

Personal attributes represent structured information about a person, such as their hobbies, pets, family, likes and dislikes.

Tasks: Attribute, Language Modelling

Challenges in Automated Debiasing for Toxic Language Detection

2 code implementations • EACL 2021 • Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Noah A. Smith, Yejin Choi

Overall, our findings show that debiasing a model trained on biased toxic language data is not as effective as simply relabeling the data to remove existing biases.

Tasks: Fairness, Text Classification +1

Linguistically-Informed Transformations (LIT): A Method for Automatically Generating Contrast Sets

no code implementations • EMNLP (BlackboxNLP) 2020 • Chuanrong Li, Lin Shengshuo, Leo Z. Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld

Although large-scale pretrained language models, such as BERT and RoBERTa, have achieved superhuman performance on in-distribution test sets, their performance suffers on out-of-distribution test sets (e.g., on contrast sets).

RPD: A Distance Function Between Word Embeddings

no code implementations • ACL 2020 • Xuhui Zhou, Zaixiang Zheng, Shu-Jian Huang

Based on the properties of RPD, we systematically study the relations among word embeddings produced by different algorithms and investigate the influence of different training processes and corpora.

Tasks: Word Embeddings
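A hedged sketch of the underlying idea: comparing two embedding spaces through their pairwise inner products, which are invariant to rotations of either space and so allow comparing embeddings of different dimensionality. The exact normalization below is an assumption, not necessarily the paper's formula.

    # Sketch of an RPD-like distance between embedding spaces. ASSUMPTION:
    # the normalization used here may differ from the paper's definition.

    import numpy as np

    def rpd(E1: np.ndarray, E2: np.ndarray) -> float:
        """E1, E2: (n_words, dim) embeddings for the same vocabulary,
        possibly trained by different algorithms or with different dims."""
        G1, G2 = E1 @ E1.T, E2 @ E2.T      # pairwise inner-product (Gram) matrices
        G1 = G1 / np.linalg.norm(G1)       # Frobenius normalization
        G2 = G2 / np.linalg.norm(G2)
        return float(np.linalg.norm(G1 - G2) ** 2)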

Evaluating Commonsense in Pre-trained Language Models

1 code implementation • 27 Nov 2019 • Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang

However, relatively little work has been done investigating commonsense knowledge contained in contextualized representations, which is crucial for human question answering and reading comprehension.

Tasks: Language Modelling, Question Answering +1
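A common zero-shot recipe for this kind of probing scores candidate sentences by language-model log-likelihood and prefers the lower-perplexity one; whether this matches the paper's exact protocol is an assumption.

    # Zero-shot plausibility probe via LM scoring (Hugging Face GPT-2).
    # This is a generic recipe, not necessarily the paper's setup.

    import torch
    from transformers import GPT2LMHeadModel, GPT2TokenizerFast

    tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
    model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

    def sentence_nll(text: str) -> float:
        """Average per-token negative log-likelihood under GPT-2."""
        ids = tokenizer(text, return_tensors="pt").input_ids
        with torch.no_grad():
            loss = model(ids, labels=ids).loss   # mean NLL over tokens
        return loss.item()

    # Lower NLL = the model finds the sentence more plausible.
    print(sentence_nll("He put the turkey in the oven."),
          sentence_nll("He put the oven in the turkey."))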

Parallel Distributed Logistic Regression for Vertical Federated Learning without Third-Party Coordinator

no code implementations • 22 Nov 2019 • Shengwen Yang, Bing Ren, Xuhui Zhou, Li-Ping Liu

The system is built on the parameter server architecture and aims to speed up model training by utilizing a cluster of servers when the volume of training data is large.

Tasks: Regression, Transfer Learning +1
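To make the setting concrete: in vertical federated learning, each party holds different feature columns for the same rows, so each can compute a partial score locally and update only its own weights. The plaintext sketch below omits the encryption a coordinator-free system would need so that parties never exchange raw scores or labels.

    # Plaintext sketch of vertically-partitioned logistic regression.
    # A real deployment would encrypt the exchanged quantities.

    import numpy as np

    def vfl_lr_step(X_parts, w_parts, y, lr=0.1):
        """X_parts: per-party feature blocks, each of shape (n, d_p);
        w_parts: matching weight vectors; y: labels held by one party."""
        logits = sum(X @ w for X, w in zip(X_parts, w_parts))  # aggregate partial scores
        residual = 1.0 / (1.0 + np.exp(-logits)) - y           # sigmoid(logits) - y
        for X, w in zip(X_parts, w_parts):                     # each party updates locally
            w -= lr * X.T @ residual / len(y)
        return w_parts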
