Search Results for author: Mark K. Ho

Found 20 papers, 4 papers with code

Exploring the hierarchical structure of human plans via program generation

2 code implementations · 30 Nov 2023 · Carlos G. Correa, Sophia Sanborn, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

We find that humans are sensitive to both metrics, but that both accounts fail to predict a qualitative feature of human-created programs, namely that people prefer programs with reuse over and above the predictions of MDL.
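
For a concrete sense of the MDL/reuse tradeoff mentioned in the excerpt, here is a minimal sketch (not the paper's model): a plan program is treated as a set of named token sequences, and description length is crudely proxied by total token count. The grid actions and subroutine name are hypothetical.

```python
# Minimal sketch (not the paper's model): compare the description length of a
# flat plan program against an equivalent program that reuses a subroutine.
# Programs are dicts mapping a definition name to its token sequence, and MDL
# is crudely proxied by the total token count.

def description_length(program):
    """Total number of tokens across all definitions (crude MDL proxy)."""
    return sum(len(tokens) for tokens in program.values())

# A flat program that repeats the same three-step chunk three times.
flat = {
    "main": ["up", "up", "right"] * 3,
}

# An equivalent program that factors the repeated chunk into a subroutine F.
# Reuse adds definition overhead, so MDL alone does not always favor it,
# which is why a preference for reuse beyond MDL is a distinct signature.
with_reuse = {
    "main": ["call F", "call F", "call F"],
    "F": ["up", "up", "right"],
}

print("flat MDL:      ", description_length(flat))        # 9
print("with-reuse MDL:", description_length(with_reuse))  # 6
```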

Structurally guided task decomposition in spatial navigation tasks

no code implementations · 3 Oct 2023 · Ruiqi He, Carlos G. Correa, Thomas L. Griffiths, Mark K. Ho

How are people able to plan so efficiently despite limited cognitive resources?

Bayesian Reinforcement Learning with Limited Cognitive Load

no code implementations · 5 May 2023 · Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

All biological and artificial agents must learn and make decisions given limits on their ability to process information.

Decision Making · Reinforcement Learning

Humans decompose tasks by trading off utility and computational cost

no code implementations · 7 Nov 2022 · Carlos G. Correa, Mark K. Ho, Frederick Callaway, Nathaniel D. Daw, Thomas L. Griffiths

Human behavior emerges from planning over elaborate decompositions of tasks into goals, subgoals, and low-level actions.

On Rate-Distortion Theory in Capacity-Limited Cognition & Reinforcement Learning

no code implementations · 30 Oct 2022 · Dilip Arumugam, Mark K. Ho, Noah D. Goodman, Benjamin Van Roy

Throughout the cognitive-science literature, there is widespread agreement that decision-making agents operating in the real world do so under limited information-processing capabilities and without access to unbounded cognitive or computational resources.

Decision Making · Reinforcement Learning · +1

Linguistic communication as (inverse) reward design

no code implementations · 11 Apr 2022 · Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths, Dylan Hadfield-Menell

We then define a pragmatic listener which performs inverse reward design by jointly inferring the speaker's latent horizon and rewards.
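
As a rough illustration of a pragmatic listener performing this kind of joint inference, the following sketch applies Bayes' rule over a small hypothesis space of horizons and rewards. The utterance, hypothesis names, utilities, and softmax speaker model are placeholder assumptions, not the paper's model.

```python
import itertools
import math

# Toy pragmatic listener: jointly infer a speaker's latent planning horizon and
# reward from one utterance via Bayes' rule. The hypotheses, utterance, and
# utilities below are illustrative placeholders, not the paper's model.

horizons = ["myopic", "farsighted"]
rewards = ["likes_mushrooms", "likes_fish"]
utterance = "grab the mushroom"

# Placeholder utilities: how useful the utterance would be, given the speaker's
# latent horizon and reward. A fuller RSA-style speaker would normalize over
# alternative utterances; here exponentiated utility stands in for P(u | h, r).
utility = {
    ("myopic", "likes_mushrooms"): 2.0,
    ("farsighted", "likes_mushrooms"): 1.0,
    ("myopic", "likes_fish"): -1.0,
    ("farsighted", "likes_fish"): 0.5,
}

beta = 3.0  # speaker rationality
posterior = {
    (h, r): math.exp(beta * utility[(h, r)])  # uniform prior absorbed by normalization
    for h, r in itertools.product(horizons, rewards)
}
total = sum(posterior.values())
for key in posterior:
    posterior[key] /= total

for (h, r), p in sorted(posterior.items(), key=lambda kv: -kv[1]):
    print(f"P(horizon={h}, reward={r} | '{utterance}') = {p:.3f}")
```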

On the Expressivity of Markov Reward

no code implementations · NeurIPS 2021 · David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.
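
To convey the flavor of such constructions (a sketch under simplifying assumptions, not the paper's algorithms): a deterministic policy's start-state value is linear in the reward vector through its discounted state-action occupancy, so asking whether some bounded reward makes a designated policy strictly optimal in a small MDP reduces to a linear feasibility problem, and infeasibility signals that no such Markov reward expresses the task. The MDP and target policy below are arbitrary toy choices.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

# Toy sketch: search for a Markov reward r(s, a) under which one designated
# deterministic policy is strictly optimal in a tiny MDP, by solving a linear
# feasibility program over reward vectors.

gamma = 0.9
n_states, n_actions = 2, 2
next_state = [[0, 1],      # deterministic transitions: next_state[s][a]
              [0, 1]]
start_dist = np.array([1.0, 0.0])

def occupancy(policy):
    """Discounted state-action occupancy of a deterministic policy,
    flattened into a vector indexed by (s, a)."""
    P = np.zeros((n_states, n_states))
    for s in range(n_states):
        P[s, next_state[s][policy[s]]] = 1.0
    rho = np.linalg.solve(np.eye(n_states) - gamma * P.T, start_dist)  # state occupancy
    d = np.zeros(n_states * n_actions)
    for s in range(n_states):
        d[s * n_actions + policy[s]] = rho[s]
    return d

policies = list(itertools.product(range(n_actions), repeat=n_states))
target = (1, 0)     # the single policy we want to make strictly optimal
margin = 1e-3

# Constraints: d_pi . r <= d_target . r - margin for every other policy pi.
d_target = occupancy(target)
A_ub = np.array([occupancy(pi) - d_target for pi in policies if pi != target])
b_ub = -margin * np.ones(len(A_ub))

res = linprog(c=np.zeros(n_states * n_actions), A_ub=A_ub, b_ub=b_ub,
              bounds=[(-1, 1)] * (n_states * n_actions))
if res.success:
    print("Found a reward realizing the task:", np.round(res.x, 3))
else:
    print("No bounded Markov reward makes this policy strictly optimal.")
```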

Cognitive science as a source of forward and inverse models of human decisions for robotics and control

no code implementations · 1 Sep 2021 · Mark K. Ho, Thomas L. Griffiths

Those designing autonomous systems that interact with humans will invariably face questions about how humans think and make decisions.

Decision Making

Extending rational models of communication from beliefs to actions

1 code implementation · 25 May 2021 · Theodore R. Sumers, Robert D. Hawkins, Mark K. Ho, Thomas L. Griffiths

Speakers communicate to influence their partner's beliefs and shape their actions.

People construct simplified mental representations to plan

no code implementations · 14 May 2021 · Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

We propose a computational account of this simplification process and, in a series of pre-registered behavioral experiments, show that it is subject to online cognitive control and that people optimally balance the complexity of a task representation and its utility for planning and acting.
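
A toy rendering of the complexity/utility tradeoff described above (the task elements, utilities, and cost are made-up placeholders, not the paper's model or experiments): score each candidate simplified representation by its planning utility minus a per-element representation cost, and keep the maximizer.

```python
from itertools import combinations

# Toy sketch: a simplified representation ("construal") is the subset of task
# elements the planner actually represents; richer construals support better
# plans but cost more to hold and use. Numbers below are placeholders.

elements = ["wall_A", "wall_B", "wall_C"]

def planning_utility(construal):
    """Placeholder: how well a plan computed under this construal performs.
    In a full model this would come from actually planning in the simplified task."""
    scores = {frozenset(): 0.0,
              frozenset({"wall_A"}): 4.0,
              frozenset({"wall_B"}): 1.0,
              frozenset({"wall_C"}): 0.5,
              frozenset({"wall_A", "wall_B"}): 4.5,
              frozenset({"wall_A", "wall_C"}): 4.2,
              frozenset({"wall_B", "wall_C"}): 1.2,
              frozenset({"wall_A", "wall_B", "wall_C"}): 4.6}
    return scores[frozenset(construal)]

cost_per_element = 1.0  # cognitive cost of representing each element

best = max(
    (set(c) for k in range(len(elements) + 1) for c in combinations(elements, k)),
    key=lambda c: planning_utility(c) - cost_per_element * len(c),
)
print("Value-maximizing construal:", best or "{}")  # ignores low-value elements
```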

Show or Tell? Demonstration is More Robust to Changes in Shared Perception than Explanation

no code implementations · 16 Dec 2020 · Theodore R. Sumers, Mark K. Ho, Thomas L. Griffiths

Nonetheless, a teacher and learner may not always experience or attend to the same aspects of the environment.

Using Machine Teaching to Investigate Human Assumptions when Teaching Reinforcement Learners

no code implementations · 5 Sep 2020 · Yun-Shiuan Chuang, Xuezhou Zhang, Yuzhe Ma, Mark K. Ho, Joseph L. Austerweil, Xiaojin Zhu

To solve the machine teaching optimization problem, we use a deep learning approximation method which simulates learners in the environment and learns to predict how feedback affects the learner's internal states.

Q-Learning
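
As a stripped-down stand-in for the simulate-the-learner idea (a tabular toy, not the paper's deep-learning approximation), the sketch below runs one update of a Q-learner that treats teacher feedback as its reward, showing how a single feedback signal shifts the learner's internal state.

```python
import numpy as np

# Toy sketch: simulate a feedback-driven learner and observe how a teacher's
# signal changes its internal state (here, a Q-table). This is a plain tabular
# stand-in, not the paper's deep-learning approximation.

n_states, n_actions = 3, 2
alpha, gamma = 0.5, 0.9

def q_update(Q, state, action, feedback, next_state):
    """One Q-learning step in which the teacher's feedback plays the role of reward."""
    Q = Q.copy()
    td_target = feedback + gamma * Q[next_state].max()
    Q[state, action] += alpha * (td_target - Q[state, action])
    return Q

Q = np.zeros((n_states, n_actions))
before = Q[0, 1]
Q = q_update(Q, state=0, action=1, feedback=+1.0, next_state=1)  # positive feedback
after = Q[0, 1]
print(f"Q(0,1) before: {before:.2f}, after positive feedback: {after:.2f}")
```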

Resource-rational Task Decomposition to Minimize Planning Costs

no code implementations · 27 Jul 2020 · Carlos G. Correa, Mark K. Ho, Fred Callaway, Thomas L. Griffiths

That is, rather than planning over a monolithic representation of a task, they decompose the task into simpler subtasks and then plan to accomplish those.
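
A back-of-the-envelope sketch of why decomposition can pay off: exhaustively searching over length-d action sequences grows like b^d, while planning to a midpoint subgoal and then on to the goal costs roughly 2·b^(d/2). The branching factor and depth below are arbitrary illustrative numbers, not figures from the paper.

```python
# Toy cost comparison: monolithic exhaustive planning vs. planning through a
# single subgoal placed halfway to the goal. Numbers are illustrative only.

def exhaustive_search_cost(branching, depth):
    """Number of action sequences enumerated by brute-force search to a given depth."""
    return branching ** depth

branching, depth = 4, 8
monolithic = exhaustive_search_cost(branching, depth)
via_subgoal = 2 * exhaustive_search_cost(branching, depth // 2)

print(f"monolithic plan:       {monolithic:,} sequences")   # 65,536
print(f"subgoal decomposition: {via_subgoal:,} sequences")  # 512
```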

The Efficiency of Human Cognition Reflects Planned Information Processing

no code implementations · 13 Feb 2020 · Mark K. Ho, David Abel, Jonathan D. Cohen, Michael L. Littman, Thomas L. Griffiths

Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions.

On the Utility of Learning about Humans for Human-AI Coordination

2 code implementations · NeurIPS 2019 · Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan

While we would like agents that can coordinate with humans, current algorithms such as self-play and population-based training create agents that can coordinate with themselves.

The Computational Structure of Unintentional Meaning

no code implementations · 3 Jun 2019 · Mark K. Ho, Joanna Korman, Thomas L. Griffiths

Speech-acts can have literal meaning as well as pragmatic meaning, but these both involve consequences typically intended by a speaker.

Learning Task Specifications from Demonstrations

no code implementations · NeurIPS 2018 · Marcell Vazquez-Chanlatte, Susmit Jha, Ashish Tiwari, Mark K. Ho, Sanjit A. Seshia

In this paper, we formulate the specification inference task as a maximum a posteriori (MAP) probability inference problem, apply the principle of maximum entropy to derive an analytic demonstration likelihood model and give an efficient approach to search for the most likely specification in a large candidate pool of specifications.
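
Here is a minimal sketch of MAP inference over a finite candidate pool of specifications, using a loosely maxent-style demonstration likelihood in which satisfying trajectories are exponentially upweighted and normalized over a finite trajectory space. The specifications, trajectories, and temperature are placeholder assumptions rather than the paper's exact model.

```python
import math

# Toy MAP specification inference. Each specification is a boolean predicate
# over trajectories; a demonstrator pursuing a spec is modeled as exponentially
# more likely to produce satisfying trajectories. Everything here is illustrative.

trajectories = ["AAB", "ABB", "BBB", "BAA"]   # finite trajectory space
demonstrations = ["AAB", "ABB"]               # observed demonstrations

candidate_specs = {
    "ends_with_B": lambda t: t.endswith("B"),
    "starts_with_A": lambda t: t.startswith("A"),
    "contains_AA": lambda t: "AA" in t,
}

beta = 2.0  # how strongly the demonstrator favors satisfying trajectories

def log_likelihood(demo, spec):
    """log P(demo | spec) under a maxent-style model over the trajectory space."""
    weights = {t: math.exp(beta * spec(t)) for t in trajectories}
    return math.log(weights[demo] / sum(weights.values()))

log_prior = math.log(1.0 / len(candidate_specs))  # uniform prior over specs
posterior_scores = {
    name: log_prior + sum(log_likelihood(d, spec) for d in demonstrations)
    for name, spec in candidate_specs.items()
}

map_spec = max(posterior_scores, key=posterior_scores.get)
print("MAP specification:", map_spec)  # here, the spec both demos satisfy best
```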

Interactive Learning from Policy-Dependent Human Feedback

no code implementations · ICML 2017 · James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman

This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
