Search Results for author: Marcus Hutter

Found 79 papers, 18 papers with code

Learning Universal Predictors

1 code implementation 26 Jan 2024 Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Grégoire Delétang, Elliot Catt, Anian Ruoss, Li Kevin Wenliang, Christopher Mattern, Matthew Aitchison, Joel Veness

Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data.

Meta-Learning

Dynamic Knowledge Injection for AIXI Agents

no code implementations 18 Dec 2023 Samuel Yang-Zhao, Kee Siong Ng, Marcus Hutter

Prior approximations of AIXI, a Bayesian optimality notion for general reinforcement learning, can only approximate AIXI's Bayesian environment model using an a-priori defined set of models.

General Reinforcement Learning

Distributional Bellman Operators over Mean Embeddings

1 code implementation 9 Dec 2023 Li Kevin Wenliang, Grégoire Delétang, Matthew Aitchison, Marcus Hutter, Anian Ruoss, Arthur Gretton, Mark Rowland

We propose a novel algorithmic framework for distributional reinforcement learning, based on learning finite-dimensional mean embeddings of return distributions.

Atari Games Distributional Reinforcement Learning +1

Bridging Algorithmic Information Theory and Machine Learning: A New Approach to Kernel Learning

no code implementations 21 Nov 2023 Boumediene Hamzi, Marcus Hutter, Houman Owhadi

Machine Learning (ML) and Algorithmic Information Theory (AIT) look at Complexity from different points of view.

Language Modeling Is Compression

1 code implementation 19 Sep 2023 Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, Joel Veness

We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning.

In-Context Learning Language Modelling

Line Search for Convex Minimization

no code implementations 31 Jul 2023 Laurent Orseau, Marcus Hutter

However, to the best of our knowledge, there is no principled exact line search algorithm for general convex functions -- including piecewise-linear and max-compositions of convex functions -- that takes advantage of convexity.

Combining a Meta-Policy and Monte-Carlo Planning for Scalable Type-Based Reasoning in Partially Observable Environments

1 code implementation 9 Jun 2023 Jonathon Schwartz, Hanna Kurniawati, Marcus Hutter

The design of autonomous agents that can interact effectively with other agents without prior coordination is a core problem in multi-agent systems.

Levin Tree Search with Context Models

1 code implementation 26 May 2023 Laurent Orseau, Marcus Hutter, Levi H. S. Lelis

Levin Tree Search (LTS) is a search algorithm that makes use of a policy (a probability distribution over actions) and comes with a theoretical guarantee on the number of expansions before reaching a goal node, depending on the quality of the policy.

Rubik's Cube

Evaluating Representations with Readout Model Switching

no code implementations 19 Feb 2023 Yazhe Li, Jorg Bornschein, Marcus Hutter

Although much of the success of Deep Learning builds on learning good representations, a rigorous method to evaluate their quality is lacking.

Model Selection

Universal Agent Mixtures and the Geometry of Intelligence

no code implementations 13 Feb 2023 Samuel Allen Alexander, David Quarel, Len Du, Marcus Hutter

Thus, if RL agent intelligence is quantified in terms of performance across environments, the weighted mixture's intelligence is the weighted average of the original agents' intelligences.

Multi-agent Reinforcement Learning Reinforcement Learning (RL)

U-Clip: On-Average Unbiased Stochastic Gradient Clipping

no code implementations 6 Feb 2023 Bryn Elesedy, Marcus Hutter

U-Clip is a simple amendment to gradient clipping that can be applied to any iterative gradient optimization algorithm.

LEMMA

Generalization Bounds for Few-Shot Transfer Learning with Pretrained Classifiers

no code implementations 23 Dec 2022 Tomer Galanti, András György, Marcus Hutter

We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.

Few-Shot Learning Generalization Bounds +1

Testing Independence of Exchangeable Random Variables

no code implementations 22 Oct 2022 Marcus Hutter

Given well-shuffled data, can we determine whether the data items are statistically (in)dependent?

Sequential Learning Of Neural Networks for Prequential MDL

no code implementations 14 Oct 2022 Jorg Bornschein, Yazhe Li, Marcus Hutter

In the prequential formulation of MDL, the objective is to minimize the cumulative next-step log-loss when sequentially going through the data and using previous observations for parameter estimation.

Image Classification

Atari-5: Distilling the Arcade Learning Environment down to Five Games

1 code implementation 5 Oct 2022 Matthew Aitchison, Penny Sweetser, Marcus Hutter

The Arcade Learning Environment (ALE) has become an essential benchmark for assessing the performance of reinforcement learning algorithms.

Atari Games

Beyond Bayes-optimality: meta-learning what you know you don't know

no code implementations 30 Sep 2022 Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, Marcus Hutter, Christopher Summerfield, Shane Legg, Pedro Ortega

This is in contrast to risk-sensitive agents, which additionally exploit the higher-order moments of the return, and ambiguity-sensitive agents, which act differently when recognizing situations in which they lack knowledge.

Decision Making Meta-Learning

Formal Algorithms for Transformers

1 code implementation 19 Jul 2022 Mary Phuong, Marcus Hutter

This document aims to be a self-contained, mathematically precise overview of transformer architectures and algorithms (*not* results).

Uniqueness and Complexity of Inverse MDP Models

no code implementations 2 Jun 2022 Marcus Hutter, Steven Hansen

In the traditional "forward" view, transition "matrix" p(s'|sa) and policy π(a|s) uniquely determine "everything": the whole dynamics p(as'a's"a"...|s), and with it, the action-conditional state process p(s's"...|saa'a"), the multi-step inverse models p(aa'a"...|ss^i), etc.

On the Role of Neural Collapse in Transfer Learning

no code implementations ICLR 2022 Tomer Galanti, András György, Marcus Hutter

We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.

Clustering Few-Shot Learning +1

Isotuning With Applications To Scale-Free Online Learning

no code implementations 29 Dec 2021 Laurent Orseau, Marcus Hutter

We extend and combine several tools of the literature to design fast, adaptive, anytime and scale-free online learning algorithms.

Reducing Planning Complexity of General Reinforcement Learning with Non-Markovian Abstractions

no code implementations 26 Dec 2021 Sultan J. Majeed, Marcus Hutter

A distinguishing feature of ESA is that it proves an upper bound of $O\left(\varepsilon^{-A} \cdot (1-\gamma)^{-2A}\right)$ on the number of states required for the surrogate MDP (where $A$ is the number of actions, $\gamma$ is the discount-factor, and $\varepsilon$ is the optimality-gap) which holds uniformly for all domains.

Decision Making General Reinforcement Learning +2

Reinforcement Learning with Information-Theoretic Actuation

no code implementations 30 Sep 2021 Elliot Catt, Marcus Hutter, Joel Veness

In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model.

reinforcement-learning Reinforcement Learning (RL)

Intelligence and Unambitiousness Using Algorithmic Information Theory

no code implementations 13 May 2021 Michael K. Cohen, Badri Vellambi, Marcus Hutter

Algorithmic Information Theory has inspired intractable constructions of general intelligence (AGI), and undiscovered tractable approximations are likely feasible.

Reinforcement Learning (RL)

Fully General Online Imitation Learning

no code implementations 17 Feb 2021 Michael K. Cohen, Marcus Hutter, Neel Nanda

If we run an imitator, we probably want events to unfold similarly to the way they would have if the demonstrator had been acting the whole time.

Imitation Learning

Learning Curve Theory

no code implementations 8 Feb 2021 Marcus Hutter

Theoretical understanding of this phenomenon is largely lacking, except in finite-dimensional models for which error typically decreases with $n^{-1/2}$ or $n^{-1}$, where $n$ is the sample size.

Exact Reduction of Huge Action Spaces in General Reinforcement Learning

no code implementations 18 Dec 2020 Sultan Javed Majeed, Marcus Hutter

In this work we show how action-binarization in the non-MDP case can significantly improve Extreme State Aggregation (ESA) bounds.

Binarization General Reinforcement Learning +4

A Combinatorial Perspective on Transfer Learning

1 code implementation NeurIPS 2020 Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness

Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks.

Continual Learning Transfer Learning

On Representing (Anti)Symmetric Functions

no code implementations 30 Jul 2020 Marcus Hutter

Permutation-invariant, -equivariant, and -covariant functions and anti-symmetric functions are important in quantum physics, computer vision, and other disciplines.

Logarithmic Pruning is All You Need

no code implementations NeurIPS 2020 Laurent Orseau, Marcus Hutter, Omar Rivasplata

The Lottery Ticket Hypothesis is a conjecture that every large neural network contains a subnetwork that, when trained in isolation, achieves comparable performance to the large network.

Pessimism About Unknown Unknowns Inspires Conservatism

no code implementations 15 Jun 2020 Michael K. Cohen, Marcus Hutter

Our other main contribution is that the agent's policy's value approaches at least that of the mentor, while the probability of deferring to the mentor goes to 0.

Online Learning in Contextual Bandits using Gated Linear Networks

no code implementations NeurIPS 2020 Eren Sezener, Marcus Hutter, David Budden, Jianan Wang, Joel Veness

We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB).

Multi-Armed Bandits

Fairness without Regret

no code implementations 11 Jul 2019 Marcus Hutter

A popular approach to achieving fairness in optimization problems is to constrain the solution space to "fair" solutions, which unfortunately typically reduces solution quality.

Fairness

Asymptotically Unambitious Artificial General Intelligence

no code implementations 29 May 2019 Michael K. Cohen, Badri Vellambi, Marcus Hutter

General intelligence, the ability to solve arbitrary solvable problems, is supposed by many to be artificially constructible.

Self-Driving Cars

Conditions on Features for Temporal Difference-Like Methods to Converge

no code implementations 28 May 2019 Marcus Hutter, Samuel Yang-Zhao, Sultan J. Majeed

The convergence of many reinforcement learning (RL) algorithms with linear function approximation has been investigated extensively but most proofs assume that these methods converge to a unique solution.

reinforcement-learning Reinforcement Learning (RL) +1

A Strongly Asymptotically Optimal Agent in General Environments

no code implementations 4 Mar 2019 Michael K. Cohen, Elliot Catt, Marcus Hutter

This is known as strong asymptotic optimality, and it was previously unknown whether it was possible for a policy to be strongly asymptotically optimal in the class of all computable probabilistic environments.

Performance Guarantees for Homomorphisms Beyond Markov Decision Processes

no code implementations 9 Nov 2018 Sultan Javed Majeed, Marcus Hutter

However, we show that near-optimal performance is sometimes guaranteed even if the homomorphism is non-Markovian.

AGI Safety Literature Review

no code implementations 3 May 2018 Tom Everitt, Gary Lea, Marcus Hutter

The development of Artificial General Intelligence (AGI) promises to be a major event.

Count-Based Exploration in Feature Space for Reinforcement Learning

1 code implementation 25 Jun 2017 Jarryd Martin, Suraj Narayanan Sasikumar, Tom Everitt, Marcus Hutter

We present a new method for computing a generalised state visit-count, which allows the agent to estimate the uncertainty associated with any state.

Atari Games Efficient Exploration +2

Universal Reinforcement Learning Algorithms: Survey and Experiments

1 code implementation 30 May 2017 John Aslanides, Jan Leike, Marcus Hutter

Many state-of-the-art reinforcement learning (RL) algorithms typically assume that the environment is an ergodic Markov Decision Process (MDP).

reinforcement-learning Reinforcement Learning (RL)

Reinforcement Learning with a Corrupted Reward Channel

1 code implementation 23 May 2017 Tom Everitt, Victoria Krakovna, Laurent Orseau, Marcus Hutter, Shane Legg

Traditional RL methods fare poorly in CRMDPs, even under strong simplifying assumptions and when trying to compensate for the possibly corrupt rewards.

reinforcement-learning Reinforcement Learning (RL)

Generalised Discount Functions applied to a Monte-Carlo AImu Implementation

1 code implementation 3 Mar 2017 Sean Lamont, John Aslanides, Jan Leike, Marcus Hutter

We have added to the GRL simulation platform AIXIjs the functionality to assign an agent arbitrary discount functions, and an environment which can be used to determine the effect of discounting on an agent's policy.

General Reinforcement Learning reinforcement-learning +1

Free Lunch for Optimisation under the Universal Distribution

no code implementations 16 Aug 2016 Tom Everitt, Tor Lattimore, Marcus Hutter

Function optimisation is a major challenge in computer science.

Death and Suicide in Universal Artificial Intelligence

no code implementations 2 Jun 2016 Jarryd Martin, Tom Everitt, Marcus Hutter

Reinforcement learning (RL) is a general paradigm for studying intelligent behaviour, with applications ranging from artificial intelligence to psychology and economics.

Reinforcement Learning (RL)

Avoiding Wireheading with Value Reinforcement Learning

no code implementations 10 May 2016 Tom Everitt, Marcus Hutter

Unfortunately, RL does not work well for generally intelligent agents, as RL agents are incentivised to shortcut the reward sensor for maximum reward -- the so-called wireheading problem.

reinforcement-learning Reinforcement Learning (RL)

Self-Modification of Policy and Utility Function in Rational Agents

no code implementations 10 May 2016 Tom Everitt, Daniel Filan, Mayank Daswani, Marcus Hutter

As we continue to create more and more intelligent agents, chances increase that they will learn about this ability.

General Reinforcement Learning

Loss Bounds and Time Complexity for Speed Priors

no code implementations 12 Apr 2016 Daniel Filan, Marcus Hutter, Jan Leike

On a polynomial time computable sequence our speed prior is computable in exponential time.

Thompson Sampling is Asymptotically Optimal in General Environments

no code implementations 25 Feb 2016 Jan Leike, Tor Lattimore, Laurent Orseau, Marcus Hutter

We discuss a variant of Thompson sampling for nonparametric reinforcement learning in countable classes of general stochastic environments.

reinforcement-learning Reinforcement Learning (RL) +1

On the Computability of AIXI

no code implementations 19 Oct 2015 Jan Leike, Marcus Hutter

Solomonoff induction and the reinforcement learning agent AIXI are proposed answers to this question.

BIG-bench Machine Learning reinforcement-learning +1

Bad Universal Priors and Notions of Optimality

no code implementations 16 Oct 2015 Jan Leike, Marcus Hutter

A big open question of algorithmic information theory is the choice of the universal Turing machine (UTM).

Open-Ended Question Answering

Solomonoff Induction Violates Nicod's Criterion

no code implementations 15 Jul 2015 Jan Leike, Marcus Hutter

Nicod's criterion states that observing a black raven is evidence for the hypothesis H that all ravens are black.

On the Computability of Solomonoff Induction and Knowledge-Seeking

no code implementations 15 Jul 2015 Jan Leike, Marcus Hutter

Solomonoff induction is held as a gold standard for learning, but it is known to be incomputable.

reinforcement-learning Reinforcement Learning (RL)

Sequential Extensions of Causal and Evidential Decision Theory

no code implementations 24 Jun 2015 Tom Everitt, Jan Leike, Marcus Hutter

Moving beyond the dualistic view in AI where agent and environment are separated incurs new challenges for decision making, as calculation of expected utility is no longer straightforward.

Decision Making

Compress and Control

no code implementations 19 Nov 2014 Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins

This paper describes a new information-theoretic policy evaluation technique for reinforcement learning.

Reinforcement Learning (RL)

Indefinitely Oscillating Martingales

no code implementations 14 Aug 2014 Jan Leike, Marcus Hutter

We construct a class of nonnegative martingale processes that oscillate indefinitely with high probability.

Robust Feature Selection by Mutual Information Distributions

no code implementations 7 Aug 2014 Marco Zaffalon, Marcus Hutter

Mutual information is widely used in artificial intelligence, in a descriptive way, to measure the stochastic dependence of discrete random variables.

Descriptive feature selection +1

Offline to Online Conversion

no code implementations 12 Jul 2014 Marcus Hutter

We consider the problem of converting offline estimators into an online predictor or estimator with small extra regret.

Extreme State Aggregation Beyond MDPs

no code implementations 12 Jul 2014 Marcus Hutter

We consider a Reinforcement Learning setup where an agent interacts with an environment in observation-reward-action cycles without any (esp. MDP) assumptions on the environment.

reinforcement-learning Reinforcement Learning (RL)

Online Learning of k-CNF Boolean Functions

no code implementations 26 Mar 2014 Joel Veness, Marcus Hutter

This paper revisits the problem of learning a k-CNF Boolean function from examples in the context of online learning under the logarithmic loss.

PAC learning

A Novel Illumination-Invariant Loss for Monocular 3D Pose Estimation

no code implementations 28 Nov 2013 Srimal Jayawardena, Marcus Hutter, Nathan Brewer

Our proposed method of registering a 3D model of a known object on a given 2D photo of the object has numerous advantages over existing methods.

3D Pose Estimation Object

The Sample-Complexity of General Reinforcement Learning

no code implementations 22 Aug 2013 Tor Lattimore, Marcus Hutter, Peter Sunehag

We present a new algorithm for general reinforcement learning where the true environment is known to belong to a finite class of N arbitrary models.

General Reinforcement Learning reinforcement-learning +1

Concentration and Confidence for Discrete Bayesian Sequence Predictors

no code implementations 29 Jun 2013 Tor Lattimore, Marcus Hutter, Peter Sunehag

We prove tight high-probability bounds on the cumulative error, which is measured in terms of the Kullback-Leibler (KL) divergence.

Context Tree Switching

1 code implementation 14 Nov 2011 Joel Veness, Kee Siong Ng, Marcus Hutter, Michael Bowling

This paper describes the Context Tree Switching technique, a modification of Context Tree Weighting for the prediction of binary, stationary, n-Markov sources.

Information Theory

Discrete MDL Predicts in Total Variation

no code implementations NeurIPS 2009 Marcus Hutter

The Minimum Description Length (MDL) principle selects the model that has the shortest code for data plus model.

reinforcement-learning Reinforcement Learning (RL) +2

A Monte Carlo AIXI Approximation

2 code implementations 4 Sep 2009 Joel Veness, Kee Siong Ng, Marcus Hutter, William Uther, David Silver

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent.

General Reinforcement Learning Open-Ended Question Answering +2

Universal Intelligence: A Definition of Machine Intelligence

no code implementations 20 Dec 2007 Shane Legg, Marcus Hutter

Finally, we survey the many other tests and definitions of intelligence that have been proposed for machines.
