no code implementations • 2 Apr 2024 • Qixiang Fang, Daniel L. Oberski, Dong Nguyen
Third, we release 4 datasets to support measuring and comparing LLM proficiency in grade school mathematics and science against human populations.
no code implementations • 20 Mar 2024 • Zhihan Zhou, Qixiang Fang, Leonardo Neves, Francesco Barbieri, Yozen Liu, Han Liu, Maarten W. Bos, Ron Dotsch
Furthermore, we introduce a novel training objective named future W-behavior prediction to transcend the limitations of next-token prediction by forecasting a broader horizon of upcoming user behaviors.
no code implementations • 19 Dec 2023 • Qixiang Fang, Zhihan Zhou, Francesco Barbieri, Yozen Liu, Leonardo Neves, Dong Nguyen, Daniel L. Oberski, Maarten W. Bos, Ron Dotsch
Using this new framework, we design a Transformer-based user model that can produce high-quality general-purpose user representations for instant messaging platforms like Snapchat.
no code implementations • 2 May 2023 • Anya Belz, Craig Thomson, Ehud Reiter, Gavin Abercrombie, Jose M. Alonso-Moral, Mohammad Arvan, Anouck Braggaar, Mark Cieliebak, Elizabeth Clark, Kees Van Deemter, Tanvi Dinkar, Ondřej Dušek, Steffen Eger, Qixiang Fang, Mingqi Gao, Albert Gatt, Dimitra Gkatzia, Javier González-Corbelle, Dirk Hovy, Manuela Hürlimann, Takumi Ito, John D. Kelleher, Filip Klubicka, Emiel Krahmer, Huiyuan Lai, Chris van der Lee, Yiru Li, Saad Mahamood, Margot Mieskes, Emiel van Miltenburg, Pablo Mosteiro, Malvina Nissim, Natalie Parde, Ondřej Plátek, Verena Rieser, Jie Ruan, Joel Tetreault, Antonio Toral, Xiaojun Wan, Leo Wanner, Lewis Watson, Diyi Yang
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible.
1 code implementation • 27 Feb 2023 • Christian Fang, Qixiang Fang, Dong Nguyen
We describe our experiments for SemEval-2023 Task 4 on the identification of human values behind arguments (ValueEval).
no code implementations • 13 Dec 2022 • Qixiang Fang, Anastasia Giachanou, Ayoub Bagheri
Stance detection (SD) can be considered a special case of textual entailment recognition (TER), a generic natural language task.
no code implementations • 13 Dec 2022 • Qixiang Fang, Anastasia Giachanou, Ayoub Bagheri, Laura Boeschoten, Erik-Jan van Kesteren, Mahdi Shafiee Kamalabad, Daniel L Oberski
Text-based personality computing (TPC) has gained many research interests in NLP.
1 code implementation • 18 Feb 2022 • Qixiang Fang, Dong Nguyen, Daniel L Oberski
Our results thus highlight the necessity to examine the construct validity of text embeddings before deploying them in social science research.
1 code implementation • EMNLP 2021 • Yupei Du, Qixiang Fang, Dong Nguyen
In this paper, we assess three types of reliability of word embedding gender bias measures, namely test-retest reliability, inter-rater consistency and internal consistency.
1 code implementation • 22 Jun 2020 • Rens van de Schoot, Jonathan de Bruin, Raoul Schram, Parisa Zahedi, Jan de Boer, Felix Weijdema, Bianca Kramer, Martijn Huijts, Maarten Hoogerwerf, Gerbrich Ferdinands, Albert Harkema, Joukje Willemsen, Yongchao Ma, Qixiang Fang, Sybren Hindriks, Lars Tummers, Daniel Oberski
To help researchers conduct a systematic review or meta-analysis as efficiently and transparently as possible, we designed a tool (ASReview) to accelerate the step of screening titles and abstracts.