1 code implementation • 28 Nov 2023 • Harsha Nori, Yin Tat Lee, Sheng Zhang, Dean Carignan, Richard Edgar, Nicolo Fusi, Nicholas King, Jonathan Larson, Yuanzhi Li, Weishung Liu, Renqian Luo, Scott Mayer McKinney, Robert Osazuwa Ness, Hoifung Poon, Tao Qin, Naoto Usuyama, Chris White, Eric Horvitz
We find that prompting innovation can unlock deeper specialist capabilities and show that GPT-4 easily tops prior leading results for medical benchmarks.
Ranked #2 on Question Answering on MedQA
no code implementations • 16 Nov 2023 • Abigail Sellen, Eric Horvitz
This calls for designs for human-AI partnership that cede ultimate control and responsibility to the human user as pilot, with the AI co-pilot acting in a well-defined supporting role.
no code implementations • 6 Jul 2023 • Markus Anderljung, Joslyn Barnhart, Anton Korinek, Jade Leung, Cullen O'Keefe, Jess Whittlestone, Shahar Avin, Miles Brundage, Justin Bullock, Duncan Cass-Beggs, Ben Chang, Tantum Collins, Tim Fist, Gillian Hadfield, Alan Hayes, Lewis Ho, Sara Hooker, Eric Horvitz, Noam Kolt, Jonas Schuett, Yonadav Shavit, Divya Siddarth, Robert Trager, Kevin Wolf
To address these challenges, at least three building blocks for the regulation of frontier models are needed: (1) standard-setting processes to identify appropriate requirements for frontier AI developers, (2) registration and reporting requirements to provide regulators with visibility into frontier AI development processes, and (3) mechanisms to ensure compliance with safety standards for the development and deployment of frontier AI models.
1 code implementation • 12 Jun 2023 • Serina Chang, Adam Fourney, Eric Horvitz
We find that holdouts, compared to early adopters matched on covariates, are 69% more likely to click on untrusted news sites.
1 code implementation • 8 Jun 2023 • Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
Using data from 535 programmers, we perform a retrospective evaluation of CDHF and show that we can avoid displaying a significant fraction of suggestions that would have been rejected.
no code implementations • 26 Apr 2023 • Debadutta Dash, Rahul Thapa, Juan M. Banda, Akshay Swaminathan, Morgan Cheatham, Mehr Kashyap, Nikesh Kotecha, Jonathan H. Chen, Saurabh Gombar, Lance Downing, Rachel Pedreira, Ethan Goh, Angel Arnaout, Garret Kenn Morris, Honor Magon, Matthew P Lungren, Eric Horvitz, Nigam H. Shah
Our objective was to determine whether two LLMs can serve information needs submitted by physicians as questions to an informatics consultation service in a safe and concordant manner.
no code implementations • 29 Mar 2023 • Michael Poli, Stefano Massaroli, Stefano Ermon, Bryan Wilder, Eric Horvitz
We present a methodology for formulating simplifying abstractions in machine learning systems by identifying and harnessing the utility structure of decisions.
2 code implementations • 22 Mar 2023 • Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.
Ranked #33 on Arithmetic Reasoning on GSM8K
no code implementations • 20 Mar 2023 • Harsha Nori, Nicholas King, Scott Mayer McKinney, Dean Carignan, Eric Horvitz
We also evaluate performance on the MultiMedQA suite of benchmark datasets.
1 code implementation • 20 Dec 2022 • Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang
We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.
1 code implementation • 25 Oct 2022 • Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz
However, to fully realize their potential, we must understand how programmers interact with these systems and identify ways to improve that interaction.
no code implementations • 5 Sep 2022 • Eric Horvitz
Over a five-year period, computing methods for generating high-fidelity, fictional depictions of people and events moved from exotic demonstrations by computer science research teams into ongoing use as a tool of disinformation.
no code implementations • 19 May 2022 • Riccardo Fogliato, Shreya Chappidi, Matthew Lungren, Michael Fitzke, Mark Parkinson, Diane Wilson, Paul Fisher, Eric Horvitz, Kori Inkpen, Besmira Nushi
A critical aspect of interaction design for AI-assisted human decision making is the set of policies governing the display and sequencing of AI inferences within larger decision-making workflows.
no code implementations • 4 May 2022 • Tom Hope, Doug Downey, Oren Etzioni, Daniel S. Weld, Eric Horvitz
We stand at the foot of a significant inflection in the trajectory of scientific discovery.
no code implementations • 7 Jan 2022 • Maarten Sap, Anna Jafarpour, Yejin Choi, Noah A. Smith, James W. Pennebaker, Eric Horvitz
We quantify the differences between autobiographical and imagined stories by introducing sequentiality, a measure of narrative flow of events, drawing probabilistic inferences from a cutting-edge large language model (GPT-3).
no code implementations • 18 Oct 2021 • Eric Horvitz, John Breese
Thus, it is important to determine the portion of resources we wish to apply to metareasoning and control versus to the execution of a solution plan.
1 code implementation • NeurIPS Workshop AI4Scien 2021 • Dan Lahav, Jon Saad Falcon, Bailey Kuehl, Sophie Johnson, Sravanthi Parasa, Noam Shomron, Duen Horng Chau, Diyi Yang, Eric Horvitz, Daniel S. Weld, Tom Hope
To address this problem, we present a novel task of extraction and search of scientific challenges and directions, to facilitate rapid knowledge discovery.
no code implementations • NeurIPS Workshop AI4Scien 2021 • Jason Portenoy, Marissa Radensky, Jevin West, Eric Horvitz, Daniel Weld, Tom Hope
We also demonstrate an approach for displaying information about authors, boosting the ability to understand the work of new, unfamiliar scholars.
no code implementations • 25 Jun 2021 • Yu Wang, Jinchao Li, Tristan Naumann, Chenyan Xiong, Hao Cheng, Robert Tinn, Cliff Wong, Naoto Usuyama, Richard Rogahn, Zhihong Shen, Yang Qin, Eric Horvitz, Paul N. Bennett, Jianfeng Gao, Hoifung Poon
A prominent case in point is the explosion of the biomedical literature on COVID-19, which swelled to hundreds of thousands of papers in a matter of months.
1 code implementation • 29 Mar 2021 • Dan Bohus, Sean Andrist, Ashley Feniello, Nick Saw, Mihai Jalobeanu, Patrick Sweeney, Anne Loomis Thompson, Eric Horvitz
We introduce Platform for Situated Intelligence, an open-source framework created to support the rapid development and study of multimodal, integrative-AI systems.
no code implementations • 17 Feb 2021 • Kristina Gligorić, Ryen W. White, Emre Kiciman, Eric Horvitz, Arnaud Chiolero, Robert West
To estimate causal effects from the passively observed log data, we control confounds in a matched quasi-experimental design: we identify focal users who at first do not have any regular eating partners but then start eating with a fixed partner regularly, and we match focal users into comparison pairs such that paired users are nearly identical with respect to covariates measured before acquiring the partner, where the two focal users' new eating partners diverge in the healthiness of their respective food choice.
no code implementations • 1 Jan 2021 • Maggie Makar, Lauren West, David Hooper, Eric Horvitz, Erica Shenoy, John Guttag
In this work we ask: can we build reliable infection prediction models when the observed data is collected under limited and biased testing that prioritizes symptomatic individuals?
1 code implementation • CVPR 2021 • Sahil Singla, Besmira Nushi, Shital Shah, Ece Kamar, Eric Horvitz
Traditional evaluation metrics for learned models that report aggregate scores over a test set are insufficient for surfacing important and informative patterns of failure over features and instances.
3 code implementations • NAACL 2021 • Tom Hope, Aida Amini, David Wadden, Madeleine van Zuylen, Sravanthi Parasa, Eric Horvitz, Daniel Weld, Roy Schwartz, Hannaneh Hajishirzi
The COVID-19 pandemic has spawned a diverse body of scientific literature that is challenging to navigate, stimulating interest in automated tools to help find useful knowledge.
no code implementations • 17 Aug 2020 • Jina Suh, Eric Horvitz, Ryen W. White, Tim Althoff
Most work to date on mitigating the COVID-19 pandemic is focused urgently on biomedicine and epidemiology.
no code implementations • 11 Aug 2020 • Megha Srivastava, Besmira Nushi, Ece Kamar, Shital Shah, Eric Horvitz
In many applications of machine learning (ML), updates are performed with the goal of enhancing model performance.
no code implementations • ACL 2020 • Maarten Sap, Eric Horvitz, Yejin Choi, Noah A. Smith, James Pennebaker
We introduce a measure of narrative flow and use this to examine the narratives for imagined and recalled events.
no code implementations • EMNLP 2020 • Tom Hope, Jason Portenoy, Kishore Vasan, Jonathan Borchardt, Eric Horvitz, Daniel S. Weld, Marti A. Hearst, Jevin West
The COVID-19 pandemic has sparked unprecedented mobilization of scientists, generating a deluge of papers that makes it hard for researchers to keep track and explore new directions.
no code implementations • 1 May 2020 • Bryan Wilder, Eric Horvitz, Ece Kamar
A rising vision for AI in the open world centers on the development of systems that can complement humans for perceptual, diagnostic, and reasoning tasks.
no code implementations • 27 Apr 2020 • Gagan Bansal, Besmira Nushi, Ece Kamar, Eric Horvitz, Daniel S. Weld
To optimize the team performance for this setting we maximize the team's expected utility, expressed in terms of the quality of the final decision, cost of verifying, and individual accuracies of people and machines.
no code implementations • CVPR 2020 • Ramprasaath R. Selvaraju, Purva Tendulkar, Devi Parikh, Eric Horvitz, Marco Ribeiro, Besmira Nushi, Ece Kamar
We quantify the extent to which this phenomenon occurs by creating a new Reasoning split of the VQA dataset and collecting VQA-introspect, a new dataset which consists of 238K new perception questions which serve as sub questions corresponding to the set of perceptual tasks needed to effectively answer the complex reasoning questions in the Reasoning split.
2 code implementations • NeurIPS 2019 • Aditya Grover, Jiaming Song, Alekh Agarwal, Kenneth Tran, Ashish Kapoor, Eric Horvitz, Stefano Ermon
A standard technique to correct this bias is importance sampling, where samples from the model are weighted by the likelihood ratio under model and true distributions.
no code implementations • 4 Jun 2019 • Gagan Bansal, Besmira Nushi, Ece Kamar, Dan Weld, Walter Lasecki, Eric Horvitz
We introduce the notion of the compatibility of an AI update with prior user experience and present methods for studying the role of compatibility in human-AI teams.
2 code implementations • NeurIPS 2019 • Hanzhang Hu, John Langford, Rich Caruana, Saurajit Mukherjee, Eric Horvitz, Debadeepta Dey
We propose a neural architecture search (NAS) algorithm, Petridish, to iteratively add shortcut connections to existing network layers.
no code implementations • 12 May 2019 • Aditya Modi, Debadeepta Dey, Alekh Agarwal, Adith Swaminathan, Besmira Nushi, Sean Andrist, Eric Horvitz
We address the opportunity to maximize the utility of an overall computing system by employing reinforcement learning to guide the configuration of the set of interacting modules that comprise the system.
no code implementations • ICLR Workshop DeepGenStruct 2019 • Aditya Grover, Jiaming Song, Ashish Kapoor, Kenneth Tran, Alekh Agarwal, Eric Horvitz, Stefano Ermon
A standard technique to correct this bias is by importance weighting samples from the model by the likelihood ratio under the model and true distributions.
1 code implementation • 10 Jan 2019 • Robert West, Eric Horvitz
Starting from the observation that satirical news headlines tend to resemble serious news headlines, we build and analyze a corpus of satirical headlines paired with nearly identical but serious headlines.
no code implementations • 19 Sep 2018 • Besmira Nushi, Ece Kamar, Eric Horvitz
We present Pandora, a set of hybrid human-machine methods and tools for describing and explaining system failures.
no code implementations • 23 May 2018 • Ramya Ramakrishnan, Ece Kamar, Debadeepta Dey, Julie Shah, Eric Horvitz
Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments.
1 code implementation • 26 Sep 2017 • Emmanouil Antonios Platanios, Ashish Kapoor, Eric Horvitz
Structured prediction is ubiquitous in applications of machine learning such as knowledge extraction and natural language processing.
no code implementations • EMNLP 2017 • Nabil Hossain, John Krumm, Lucy Vanderwende, Eric Horvitz, Henry Kautz
Computerized generation of humor is a notoriously difficult AI problem.
no code implementations • 13 Jul 2017 • Gregory D. Hager, Randal Bryant, Eric Horvitz, Maja Mataric, Vasant Honavar
Advances in Artificial Intelligence require progress across all of computer science.
no code implementations • NeurIPS 2017 • Emmanouil A. Platanios, Hoifung Poon, Tom M. Mitchell, Eric Horvitz
We propose an efficient method to estimate the accuracy of classifiers using only unlabeled data.
no code implementations • 24 Nov 2016 • Besmira Nushi, Ece Kamar, Eric Horvitz, Donald Kossmann
We study the problem of troubleshooting machine learning systems that rely on analytical pipelines of distinct components.
no code implementations • 28 Oct 2016 • Himabindu Lakkaraju, Ece Kamar, Rich Caruana, Eric Horvitz
Predictive models deployed in the real world may assign incorrect labels to instances with high confidence.
no code implementations • 16 Sep 2016 • Ethan Fast, Eric Horvitz
We find that discussion of AI has increased sharply since 2009, and that these discussions have been consistently more optimistic than pessimistic.
no code implementations • EMNLP 2016 • Ethan Fast, Eric Horvitz
When we use our predictive model to analyze millions of other Reddit posts, we find evidence that suggests dogmatism is a deeper personality trait, present for dogmatic users across many different domains, and that users who engage on dogmatic comments tend to show increases in dogmatic posts themselves.
no code implementations • 12 Aug 2015 • Adish Singla, Eric Horvitz, Pushmeet Kohli, Andreas Krause
Furthermore, we consider an embedding of the tasks and workers in an underlying graph that may arise from task similarities or social ties, and that can provide additional side-observations for faster learning.
no code implementations • 3 May 2015 • Christopher H. Lin, Andrey Kolobov, Ece Kamar, Eric Horvitz
Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.
no code implementations • 24 Apr 2015 • Adish Singla, Eric Horvitz, Pushmeet Kohli, Ryen White, Andreas Krause
How should we gather information in a network, where each node's visibility is limited to its local neighborhood?
no code implementations • 23 Jan 2015 • Ashish Kapoor, E. Paxon Frady, Stefanie Jegelka, William B. Kristan, Eric Horvitz
We introduce and study methods for inferring and learning from correspondences among neurons.
no code implementations • 22 Apr 2014 • Adish Singla, Eric Horvitz, Ece Kamar, Ryen White
Users may be willing to share private information in return for better quality of service or for incentives, or in return for assurances about the nature and extent of the logging of data.
no code implementations • 16 Jan 2014 • Andreas Krause, Eric Horvitz
We introduce and explore an economics of privacy in personalization, where people can opt to share personal information, in a standing or on-demand manner, in return for expected enhancements in the quality of an online service.
no code implementations • 13 Apr 2013 • Eric Horvitz, Finn Jensen
This is the Proceedings of the Twelfth Conference on Uncertainty in Artificial Intelligence, which was held in Portland, OR, August 1-4, 1996.
no code implementations • NeurIPS 2012 • Jenna Wiens, Eric Horvitz, John V. Guttag
A patient's risk for adverse events is affected by temporal processes including the nature and timing of diagnostic and therapeutic activities, and the overall evolution of the patient's pathophysiology over time.
no code implementations • NeurIPS 2009 • Ashish Kapoor, Eric Horvitz
There has been a clear distinction between induction- or training-time and diagnosis-time active information acquisition.