no code implementations • 1 Apr 2024 • Weixin Liang, Yaohui Zhang, Zhengxuan Wu, Haley Lepp, Wenlong Ji, Xuandong Zhao, Hancheng Cao, Sheng Liu, Siyu He, Zhi Huang, Diyi Yang, Christopher Potts, Christopher D Manning, James Y. Zou
To address this gap, we conduct the first systematic, large-scale analysis across 950,965 papers published between January 2020 and February 2024 on arXiv, bioRxiv, and in Nature portfolio journals, using a population-level statistical framework to measure the prevalence of LLM-modified content over time.
no code implementations • 11 Mar 2024 • Weixin Liang, Zachary Izzo, Yaohui Zhang, Haley Lepp, Hancheng Cao, Xuandong Zhao, Lingjiao Chen, Haotian Ye, Sheng Liu, Zhi Huang, Daniel A. McFarland, James Y. Zou
We present an approach for estimating the fraction of text in a large corpus which is likely to be substantially modified or produced by a large language model (LLM).
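A population-level estimate of this kind can be sketched as a two-component mixture: the corpus token distribution is modeled as a blend of a "human-written" and an "LLM-modified" distribution, and the mixture weight is fit by maximum likelihood. The sketch below is illustrative only, with toy token probabilities; it is not the authors' released implementation.

```python
# Hedged sketch: grid-search MLE for the fraction alpha of LLM-modified
# text, modeling observed token counts as a mixture of two known
# per-class token distributions. The probabilities below are toy values.
import math

# Toy per-token probabilities under "human-written" vs. "LLM-modified" text.
p_human = {"delve": 0.001, "significant": 0.099, "results": 0.900}
p_llm   = {"delve": 0.050, "significant": 0.150, "results": 0.800}

def log_likelihood(alpha, token_counts):
    """Log-likelihood of observed token counts under the alpha-mixture."""
    ll = 0.0
    for tok, n in token_counts.items():
        p = alpha * p_llm[tok] + (1 - alpha) * p_human[tok]
        ll += n * math.log(p)
    return ll

def estimate_alpha(token_counts, grid=1000):
    """Grid-search MLE for the mixture weight alpha in [0, 1]."""
    best = max(range(grid + 1),
               key=lambda i: log_likelihood(i / grid, token_counts))
    return best / grid

observed = {"delve": 30, "significant": 120, "results": 850}
alpha_hat = estimate_alpha(observed)
```

In this toy corpus, "delve" and "significant" are over-represented relative to the human baseline, so the estimated mixture weight lands around one half.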
1 code implementation • 4 Oct 2023 • Hancheng Cao, Jesse Dodge, Kyle Lo, Daniel A. McFarland, Lucy Lu Wang
In recent years, funding agencies and journals increasingly advocate for open science practices (e.g., data and method sharing) to improve the transparency, access, and reproducibility of science.
1 code implementation • 3 Oct 2023 • Weixin Liang, Yuhui Zhang, Hancheng Cao, Binglu Wang, Daisy Ding, Xinyu Yang, Kailas Vodrahalli, Siyu He, Daniel Smith, Yian Yin, Daniel McFarland, James Zou
We first quantitatively compared GPT-4's generated feedback with human peer reviewer feedback in 15 Nature family journals (3,096 papers in total) and the ICLR machine learning conference (1,709 papers).
no code implementations • 26 Sep 2023 • Jie Li, Hancheng Cao, Laura Lin, Youyang Hou, Ruihao Zhu, Abdallah El Ali
They emphasized the unique human factors of "enjoyment" and "agency", where humans remain the arbiters of "AI alignment".
no code implementations • 3 Aug 2023 • Hancheng Cao, Sofia Eleni Spatharioti, Daniel G. Goldstein, Jake M. Hofman
Numerical perspectives help people understand extreme and unfamiliar numbers (e.g., \$330 billion is about \$1,000 per person in the United States).
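The per-person rephrasing in the example above is just a per-capita rescaling with rounding. The following sketch shows that arithmetic; the population figure and rounding rule are illustrative assumptions, not the system described in the paper.

```python
# Hedged sketch of a per-capita "perspective" rephrasing.
# The population constant is an approximation used for illustration.
US_POPULATION = 330_000_000

def per_capita_perspective(amount_usd: float) -> str:
    """Re-express a large dollar amount as a rounded per-person figure."""
    per_person = amount_usd / US_POPULATION
    return f"about ${round(per_person):,} per person in the United States"

print(per_capita_perspective(330e9))
# → about $1,000 per person in the United States
```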
1 code implementation • 19 Dec 2022 • Mina Lee, Megha Srivastava, Amelia Hardy, John Thickstun, Esin Durmus, Ashwin Paranjape, Ines Gerard-Ursin, Xiang Lisa Li, Faisal Ladhak, Frieda Rong, Rose E. Wang, Minae Kwon, Joon Sung Park, Hancheng Cao, Tony Lee, Rishi Bommasani, Michael Bernstein, Percy Liang
To evaluate human-LM interaction, we develop a new framework, Human-AI Language-based Interaction Evaluation (HALIE), that defines the components of interactive systems and dimensions to consider when designing evaluation metrics.
no code implementations • 8 Aug 2022 • Zhilong Chen, Jinghua Piao, Xiaochong Lan, Hancheng Cao, Chen Gao, Zhicong Lu, Yong Li
Recommender systems are playing an increasingly important role in alleviating information overload and supporting users' various needs, e.g., consumption, socialization, and entertainment.
no code implementations • 14 Oct 2020 • Hancheng Cao, Vivian Yang, Victor Chen, Yu Jin Lee, Lydia Stone, N'godjigui Junior Diarrassouba, Mark E. Whiting, Michael S. Bernstein
From these models, we identify the use of exclusive language such as "but" and "except", and the use of second-person pronouns, as the most predictive features for detecting the most viable teams, suggesting that active engagement with others' ideas is a crucial signal of a viable team.
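The two lexical cues named above can be counted with a simple tokenizer, yielding per-message features for a viability classifier. This is a hedged sketch with illustrative word lists, not the study's actual feature pipeline.

```python
# Hedged sketch: counting exclusive words and second-person pronouns as
# message-level features. The word lists are illustrative assumptions.
import re

EXCLUSIVES = {"but", "except", "without", "however"}
SECOND_PERSON = {"you", "your", "yours", "yourself"}

def engagement_features(message: str) -> dict:
    """Per-message counts of exclusive words and second-person pronouns."""
    tokens = re.findall(r"[a-z']+", message.lower())
    return {
        "exclusives": sum(t in EXCLUSIVES for t in tokens),
        "second_person": sum(t in SECOND_PERSON for t in tokens),
    }

feats = engagement_features("I like your idea, but what if you tried X?")
```

Aggregating such counts over a team's full conversation would give the kind of team-level predictor the abstract describes.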
no code implementations • Findings of the Association for Computational Linguistics 2020 • Hancheng Cao, Mengjie Cheng, Zhepeng Cen, Daniel A. McFarland, Xiang Ren
We extract scientific concepts (i.e., phrases) from corpora as instantiations of "research ideas", create concept-level features motivated by the literature, and then follow the trajectories of over 450,000 new concepts (which emerged between 1995 and 2014) to identify the factors that lead only a small proportion of these ideas to be used in inventions and drug trials.
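Following a concept's trajectory starts with dating its emergence: the first year a phrase appears in a dated corpus. The sketch below illustrates that first step on toy data; it is not the paper's pipeline.

```python
# Hedged sketch: recording the first year each concept (phrase) appears
# in a dated corpus, as a starting point for trajectory analysis.
def first_emergence(papers):
    """papers: iterable of (year, set_of_phrases) -> {phrase: first year seen}."""
    emerged = {}
    for year, phrases in sorted(papers, key=lambda t: t[0]):
        for phrase in phrases:
            emerged.setdefault(phrase, year)
    return emerged

corpus = [
    (1995, {"neural network"}),
    (1998, {"neural network", "support vector machine"}),
    (2014, {"word embedding"}),
]
emerged = first_emergence(corpus)
```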