no code implementations • 18 Apr 2024 • Shreya Shankar, J. D. Zamfirescu-Pereira, Björn Hartmann, Aditya G. Parameswaran, Ian Arawjo
In particular, we identify a phenomenon we dub "criteria drift": users need criteria to grade outputs, but grading outputs helps users define criteria.
no code implementations • 12 Feb 2024 • Alice Cai, Ian Arawjo, Elena L. Glassman
The vast majority of discourse around AI development assumes that subservient, "moral" models aligned with "human values" are universally beneficial -- in short, that good AI is sycophantic AI.
no code implementations • 12 Feb 2024 • Priyan Vaithilingam, Ian Arawjo, Elena L. Glassman
We ideate a future design workflow that involves AI technology.
1 code implementation • 17 Sep 2023 • Ian Arawjo, Chelse Swoopes, Priyan Vaithilingam, Martin Wattenberg, Elena Glassman
Evaluating outputs of large language models (LLMs) is challenging because it requires making -- and making sense of -- many responses.