no code implementations • 8 Mar 2024 • Xuhui Zhou, Zhe Su, Tiwalayo Eisape, Hyunwoo Kim, Maarten Sap
Recent advances in large language models (LLMs) have enabled richer social simulations, allowing for the study of various social phenomena.
no code implementations • 27 Oct 2023 • Niloofar Mireshghallah, Hyunwoo Kim, Xuhui Zhou, Yulia Tsvetkov, Maarten Sap, Reza Shokri, Yejin Choi
The interactive use of large language models (LLMs) in AI assistants (at work, home, etc.) introduces a new set of inference-time privacy risks.
no code implementations • 24 Oct 2023 • Hyunwoo Kim, Melanie Sclar, Xuhui Zhou, Ronan Le Bras, Gunhee Kim, Yejin Choi, Maarten Sap
Theory of mind (ToM) evaluations currently focus on testing models using passive narratives that inherently lack interactivity.
1 code implementation • 18 Oct 2023 • Xuhui Zhou, Hao Zhu, Leena Mathur, Ruohong Zhang, Haofei Yu, Zhengyang Qi, Louis-Philippe Morency, Yonatan Bisk, Daniel Fried, Graham Neubig, Maarten Sap
We present SOTOPIA, an open-ended environment to simulate complex social interactions between artificial agents and evaluate their social intelligence.
1 code implementation • 25 Jul 2023 • Shuyan Zhou, Frank F. Xu, Hao Zhu, Xuhui Zhou, Robert Lo, Abishek Sridhar, Xianyi Cheng, Tianyue Ou, Yonatan Bisk, Daniel Fried, Uri Alon, Graham Neubig
Building upon our environment, we release a set of benchmark tasks focusing on evaluating the functional correctness of task completions.
no code implementations • 3 Jun 2023 • Xuhui Zhou, Hao Zhu, Akhila Yerukola, Thomas Davidson, Jena D. Hwang, Swabha Swayamdipta, Maarten Sap
To study the contextual dynamics of offensiveness, we train models to generate COBRA explanations, with and without access to the context.
no code implementations • 24 May 2023 • Akhila Yerukola, Xuhui Zhou, Elizabeth Clark, Maarten Sap
Most existing stylistic text rewriting methods and evaluation metrics operate at the sentence level, but ignoring the broader context of the text can lead to preferring generic, ambiguous, and incoherent rewrites.
no code implementations • 24 May 2023 • Natalie Shapira, Mosh Levy, Seyed Hossein Alavi, Xuhui Zhou, Yejin Choi, Yoav Goldberg, Maarten Sap, Vered Shwartz
The escalating debate on AI's capabilities warrants developing reliable metrics to assess machine "intelligence".
1 code implementation • 14 Jul 2022 • C. M. Downey, Xuhui Zhou, Leo Z. Liu, Shane Steinert-Threlkeld
We formulate and test a technique to use Emergent Communication (EC) with a pre-trained multilingual model to improve on modern Unsupervised NMT systems, especially for low-resource languages.
no code implementations • NAACL 2022 • Maarten Sap, Swabha Swayamdipta, Laura Vianna, Xuhui Zhou, Yejin Choi, Noah A. Smith
The perceived toxicity of language can vary based on someone's identity and beliefs, but this variation is often ignored when collecting toxic language datasets, resulting in dataset and model biases.
1 code implementation • NLP4ConvAI (ACL) 2022 • Zhilin Wang, Xuhui Zhou, Rik Koncel-Kedziorski, Alex Marin, Fei Xia
Personal attributes represent structured information about a person, such as their hobbies, pets, family, likes and dislikes.
2 code implementations • EACL 2021 • Xuhui Zhou, Maarten Sap, Swabha Swayamdipta, Noah A. Smith, Yejin Choi
Overall, our findings show that debiasing a model trained on biased toxic language data is not as effective as simply relabeling the data to remove existing biases.
no code implementations • EMNLP (BlackboxNLP) 2020 • Chuanrong Li, Lin Shengshuo, Leo Z. Liu, Xinyi Wu, Xuhui Zhou, Shane Steinert-Threlkeld
Although large-scale pretrained language models, such as BERT and RoBERTa, have achieved superhuman performance on in-distribution test sets, their performance suffers on out-of-distribution test sets (e.g., on contrast sets).
1 code implementation • EMNLP 2020 • Xuhui Zhou, Nikolaos Pappas, Noah A. Smith
Text alignment finds application in tasks such as citation recommendation and plagiarism detection.
no code implementations • ACL 2020 • Xuhui Zhou, Zaixiang Zheng, Shu-Jian Huang
Based on the properties of RPD, we study the relations of word embeddings of different algorithms systematically and investigate the influence of different training processes and corpora.
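The entry above only names the RPD metric without defining it. As a rough, hypothetical illustration of how an embedding-space distance of this kind can be computed, here is a minimal sketch that compares two aligned embedding matrices through their normalized pairwise inner-product (Gram) matrices; the function name gram_distance and the exact normalization are assumptions for illustration, not the definition used in the paper.

    # Hypothetical sketch (not the paper's exact definition): compare two word
    # embedding spaces via the distance between their normalized Gram matrices.
    import numpy as np

    def gram_distance(E1: np.ndarray, E2: np.ndarray) -> float:
        """E1, E2: (vocab_size, dim) embeddings over the same, aligned vocabulary."""
        G1 = E1 @ E1.T                   # pairwise inner products in space 1
        G2 = E2 @ E2.T                   # pairwise inner products in space 2
        G1 = G1 / np.linalg.norm(G1)     # normalize so overall scale differences cancel
        G2 = G2 / np.linalg.norm(G2)
        return float(np.linalg.norm(G1 - G2))  # Frobenius distance between the spaces

    # Example: two random "embedding" matrices over a shared 100-word vocabulary.
    rng = np.random.default_rng(0)
    print(gram_distance(rng.normal(size=(100, 50)), rng.normal(size=(100, 50))))

Because the comparison goes through inner products of an aligned vocabulary rather than raw coordinates, a measure of this shape is insensitive to rotations of the embedding space, which is what makes it usable for comparing embeddings trained by different algorithms or on different corpora.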
1 code implementation • 27 Nov 2019 • Xuhui Zhou, Yue Zhang, Leyang Cui, Dandan Huang
However, relatively little work has been done investigating commonsense knowledge contained in contextualized representations, which is crucial for human question answering and reading comprehension.
no code implementations • 22 Nov 2019 • Shengwen Yang, Bing Ren, Xuhui Zhou, Li-Ping Liu
The system is built on the parameter server architecture and aims to speed up model training by utilizing a cluster of servers when the volume of training data is large.
1 code implementation • Preprint 2019 • Zhiwei Zhai, Marius Staring, Xuhui Zhou, Qiuxia Xie, Xiaojuan Xiao, M. Els Bakker, Lucia J. Kroft, Boudewijn P. F. Lelieveldt, Gudula J.A.M. Boon, Frederikus A. Klok, Berend C. Stoel
In conclusion, the proposed CNN-GCN method combines local image information with graph connectivity information, improving pulmonary A/V separation over a baseline CNN method, approaching the performance of human observers.
Ranked #1 on Pulmonary Artery–Vein Classification on SunYs
3D Medical Imaging Segmentation • Pulmonary Artery–Vein Classification