Search Results for author: Mark K. Ho

Found 20 papers, 4 papers with code

Exploring the hierarchical structure of human plans via program generation

2 code implementations · 30 Nov 2023 · Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL.
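
For a concrete sense of the MDL/reuse tradeoff mentioned in the excerpt, here is a minimal sketch (not the paper's model): a plan program is treated as a set of named token sequences, and description length is crudely proxied by total token count. The grid actions and subroutine name are hypothetical.

```python
# Minimal sketch (not the paper's model): compare the description length of a
# flat plan program against an equivalent program that reuses a subroutine.
# Programs are dicts mapping a definition name to its token sequence, and MDL
# is crudely proxied by the total token count.

def description_length(program):
    """Total number of tokens across all definitions (crude MDL proxy)."""
    return sum(len(tokens) for tokens in program.values())

# A flat program that repeats the same three-step chunk three times.
flat = {
    "main": ["up", "up", "right"] * 3,
}

# An equivalent program that factors the repeated chunk into a subroutine F.
# Reuse adds definition overhead, so MDL alone does not always favor it,
# which is why a preference for reuse beyond MDL is a distinct signature.
with_reuse = {
    "main": ["call F", "call F", "call F"],
    "F": ["up", "up", "right"],
}

print("flat MDL:      ", description_length(flat))        # 9
print("with-reuse MDL:", description_length(with_reuse))  # 6
```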

Structurally guided task decomposition in spatial navigation tasks

no code implementations · 3 Oct 2023 · Ruiqi He, Carlos G. Correa, Thomas L. Griffiths, Mark K. Ho

How are people able to plan so efficiently despite limited cognitive resources?

Bayesian Reinforcement Learning with Limited Cognitive Load

no code implementations · 5 May 2023 · Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

All biological and artificial agents must learn and make decisions given limits on their ability to process information.

Decision Making · Reinforcement Learning

Humans decompose tasks by trading off utility and computational cost

no code implementations · 7 Nov 2022 · Carlos G. Correa, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions.

On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

no code implementations · 30 Oct 2022 · Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources.

Decision Making · Reinforcement Learning · +1

Linguistic communication as (inverse) reward design

no code implementations · 11 Apr 2022 · Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths, Dylan Hadfield-Menell

We then define a pragmatic listener which performs inverse reward design by jointly inferring the speaker's latent horizon and rewards.
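
As a rough illustration of a pragmatic listener performing this kind of joint inference, the following sketch applies Bayes' rule over a small hypothesis space of horizons and rewards. The utterance, hypothesis names, utilities, and softmax speaker model are placeholder assumptions, not the paper's model.

```python
import itertools
import math

# Toy pragmatic listener: jointly infer a speaker's latent planning horizon and
# reward from one utterance via Bayes' rule. The hypotheses, utterance, and
# utilities below are illustrative placeholders, not the paper's model.

horizons = ["myopic", "farsighted"]
rewards = ["likes_mushrooms", "likes_fish"]
utterance = "grab the mushroom"

# Placeholder utilities: how useful the utterance would be, given the speaker's
# latent horizon and reward. A fuller RSA-style speaker would normalize over
# alternative utterances; here exponentiated utility stands in for P(u | h, r).
utility = {
    ("myopic", "likes_mushrooms"): 2.0,
    ("farsighted", "likes_mushrooms"): 1.0,
    ("myopic", "likes_fish"): -1.0,
    ("farsighted", "likes_fish"): 0.5,
}

beta = 3.0  # speaker rationality
posterior = {
    (h, r): math.exp(beta * utility[(h, r)])  # uniform prior absorbed by normalization
    for h, r in itertools.product(horizons, rewards)
}
total = sum(posterior.values())
for key in posterior:
    posterior[key] /= total

for (h, r), p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P(horizon={h}, reward={r} | '{utterance}') = {p:.3f}")
```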

On the Expressivity of Markov Reward

no code implementations · NeurIPS 2021 · David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.
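
To convey the flavor of such constructions (a sketch under simplifying assumptions, not the paper's algorithms): a deterministic policy's start-state value is linear in the reward vector through its discounted state-action occupancy, so asking whether some bounded reward makes a designated policy strictly optimal in a small MDP reduces to a linear feasibility problem, and infeasibility signals that no such Markov reward expresses the task. The MDP and target policy below are arbitrary toy choices.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Toy sketch: search for a Markov reward r(s, a) under which one designated
# deterministic policy is strictly optimal in a tiny MDP, by solving a linear
# feasibility program over reward vectors.

gamma = 0.9
n_states, n_actions = 2, 2
next_state = [[0, 1],      # deterministic transitions: next_state[s][a]
              [0, 1]]
start_dist = np.array([1.0, 0.0])

def occupancy(policy):
    """Discounted state-action occupancy of a deterministic policy,
    flattened into a vector indexed by (s, a)."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states):
        P[s, next_state[s][policy[s]]] = 1.0
    rho = np.linalg.solve(np.eye(n_states) - gamma * P.T, start_dist)  # state occupancy
    d = np.zeros(n_states * n_actions)
    for s in range(n_states):
        d[s * n_actions + policy[s]] = rho[s]
    return d

policies = list(itertools.product(range(n_actions), repeat=n_states))
target = (1, 0)     # the single policy we want to make strictly optimal
margin = 1e-3

# Constraints: d_pi . r <= d_target . r - margin for every other policy pi.
d_target = occupancy(target)
A_ub = np.array([occupancy(pi) - d_target for pi in policies if pi != target])
b_ub = -margin * np.ones(len(A_ub))

res = linprog(c=np.zeros(n_states * n_actions), A_ub=A_ub, b_ub=b_ub,
              bounds=[(-1, 1)] * (n_states * n_actions))
if res.success:
    print("Found a reward realizing the task:", np.round(res.x, 3))
else:
    print("No bounded Markov reward makes this policy strictly optimal.")
```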

Cognitive science as a source of forward and inverse models of human decisions for robotics and control

no code implementations · 1 Sep 2021 · Mark K. Ho, Thomas L. Griffiths

Those designing autonomous systems that interact with humans will invariably face questions about how humans think and make decisions.

Decision Making

Extending rational models of communication from beliefs to actions

1 code implementation · 25 May 2021 · Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths

Speakers communicate to influence their partner's beliefs and shape their actions.

People construct simplified mental representations to plan

no code implementations · 14 May 2021 · Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

We propose a computational account of this simplification process and, in a series of pre-registered behavioral experiments, show that it is subject to online cognitive control and that people optimally balance the complexity of a task representation and its utility for planning and acting.
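
A toy rendering of the complexity/utility tradeoff described above (the task elements, utilities, and cost are made-up placeholders, not the paper's model or experiments): score each candidate simplified representation by its planning utility minus a per-element representation cost, and keep the maximizer.

```python
from itertools import combinations

# Toy sketch: a simplified representation ("construal") is the subset of task
# elements the planner actually represents; richer construals support better
# plans but cost more to hold and use. Numbers below are placeholders.

elements = ["wall_A", "wall_B", "wall_C"]

def planning_utility(construal):
    """Placeholder: how well a plan computed under this construal performs.
    In a full model this would come from actually planning in the simplified task."""
    scores = {frozenset(): 0.0,
              frozenset({"wall_A"}): 4.0,
              frozenset({"wall_B"}): 1.0,
              frozenset({"wall_C"}): 0.5,
              frozenset({"wall_A", "wall_B"}): 4.5,
              frozenset({"wall_A", "wall_C"}): 4.2,
              frozenset({"wall_B", "wall_C"}): 1.2,
              frozenset({"wall_A", "wall_B", "wall_C"}): 4.6}
    return scores[frozenset(construal)]

cost_per_element = 1.0  # cognitive cost of representing each element

best = max(
    (set(c) for k in range(len(elements) + 1) for c in combinations(elements, k)),
    key=lambda c: planning_utility(c) - cost_per_element * len(c),
)
print("Value-maximizing construal:", best or "{}")  # ignores low-value elements
```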

Show or Tell? Demonstration is More Robust to Changes in Shared Perception than Explanation

no code implementations · 16 Dec 2020 · Theodore R. Sumers, Mark K. Ho, Thomas L. Griffiths

Nonetheless, a teacher and learner may not always experience or attend to the same aspects of the environment.

Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners

no code implementations · 5 Sep 2020 · Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu

To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states.

Q-Learning
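
As a stripped-down stand-in for the simulate-the-learner idea (a tabular toy, not the paper's deep-learning approximation), the sketch below runs one update of a Q-learner that treats teacher feedback as its reward, showing how a single feedback signal shifts the learner's internal state.

```python
import numpy as np

# Toy sketch: simulate a feedback-driven learner and observe how a teacher's
# signal changes its internal state (here, a Q-table). This is a plain tabular
# stand-in, not the paper's deep-learning approximation.

n_states, n_actions = 3, 2
alpha, gamma = 0.5, 0.9

def q_update(Q, state, action, feedback, next_state):
    """One Q-learning step in which the teacher's feedback plays the role of reward."""
    Q = Q.copy()
    td_target = feedback + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q

Q = np.zeros((n_states, n_actions))
before = Q[0, 1]
Q = q_update(Q, state=0, action=1, feedback=+1.0, next_state=1)  # positive feedback
after = Q[0, 1]
print(f"Q(0,1) before: {before:.2f}, after positive feedback: {after:.2f}")
```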

Resource-rational Task Decomposition to Minimize Planning Costs

no code implementations · 27 Jul 2020 · Carlos G. Correa, Mark K. Ho, Fred Callaway, Thomas L. Griffiths

That is, rather than planning over a monolithic representation of a task, they decompose the task into simpler subtasks and then plan to accomplish those.
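
A back-of-the-envelope sketch of why decomposition can pay off: exhaustively searching over length-d action sequences grows like b^d, while planning to a midpoint subgoal and then on to the goal costs roughly 2·b^(d/2). The branching factor and depth below are arbitrary illustrative numbers, not figures from the paper.

```python
# Toy cost comparison: monolithic exhaustive planning vs. planning through a
# single subgoal placed halfway to the goal. Numbers are illustrative only.

def exhaustive_search_cost(branching, depth):
    """Number of action sequences enumerated by brute-force search to a given depth."""
    return branching ** depth

branching, depth = 4, 8
monolithic = exhaustive_search_cost(branching, depth)
via_subgoal = 2 * exhaustive_search_cost(branching, depth // 2)

print(f"monolithic plan:       {monolithic:,} sequences")   # 65,536
print(f"subgoal decomposition: {via_subgoal:,} sequences")  # 512
```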

The Efficiency of Human Cognition Reflects Planned Information Processing

no code implementations · 13 Feb 2020 · Mark K. Ho, David Abel, Jonathan D. Cohen, Michael L. Littman, Thomas L. Griffiths

Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions.

On the Utility of Learning about Humans for Human-AI Coordination

2 code implementations · NeurIPS 2019 · Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan

While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can coordinate with themselves.

The Computational Structure of Unintentional Meaning

no code implementations · 3 Jun 2019 · Mark K. Ho, Joanna Korman, Thomas L. Griffiths

Speech-acts can have literal meaning as well as pragmatic meaning, but these both involve consequences typically intended by a speaker.

Learning Task Specifications from Demonstrations

no code implementations · NeurIPS 2018 · Marcell Vazquez-Chanlatte, Susmit Jha, Ashish Tiwari, Mark K. Ho, Sanjit A. Seshia

In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model and give an efficient approach to search for the most likely specification in a large candidate pool of specifications.
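
Here is a minimal sketch of MAP inference over a finite candidate pool of specifications, using a loosely maxent-style demonstration likelihood in which satisfying trajectories are exponentially upweighted and normalized over a finite trajectory space. The specifications, trajectories, and temperature are placeholder assumptions rather than the paper's exact model.

```python
import math

# Toy MAP specification inference. Each specification is a boolean predicate
# over trajectories; a demonstrator pursuing a spec is modeled as exponentially
# more likely to produce satisfying trajectories. Everything here is illustrative.

trajectories = ["AAB", "ABB", "BBB", "BAA"]   # finite trajectory space
demonstrations = ["AAB", "ABB"]               # observed demonstrations

candidate_specs = {
    "ends_with_B": lambda t: t.endswith("B"),
    "starts_with_A": lambda t: t.startswith("A"),
    "contains_AA": lambda t: "AA" in t,
}

beta = 2.0  # how strongly the demonstrator favors satisfying trajectories

def log_likelihood(demo, spec):
    """log P(demo | spec) under a maxent-style model over the trajectory space."""
    weights = {t: math.exp(beta * spec(t)) for t in trajectories}
    return math.log(weights[demo] / sum(weights.values()))

log_prior = math.log(1.0 / len(candidate_specs))  # uniform prior over specs
posterior_scores = {
    name: log_prior + sum(log_likelihood(d, spec) for d in demonstrations)
    for name, spec in candidate_specs.items()
}

map_spec = max(posterior_scores, key=posterior_scores.get)
print("MAP specification:", map_spec)  # here, the spec both demos satisfy best
```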

Interactive Learning from Policy-Dependent Human Feedback

no code implementations · ICML 2017 · James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
