Search Results for author: Marlos C. Machado

Found 36 papers, 14 papers with code

Compound Returns Reduce Variance in Reinforcement Learning

no code implementations • 6 Feb 2024 • Brett Daley, Martha White, Marlos C. Machado

Multistep returns, such as $n$-step returns and $\lambda$-returns, are commonly used to improve the sample efficiency of reinforcement learning (RL) methods.

reinforcement-learning Reinforcement Learning (RL)

GVFs in the Real World: Making Predictions Online for Water Treatment

no code implementations • 4 Dec 2023 • Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White

In this paper we investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant.

Time Series Prediction

Harnessing Discrete Representations For Continual Reinforcement Learning

no code implementations • 2 Dec 2023 • Edan Meyer, Adam White, Marlos C. Machado

In this work, we provide a thorough empirical investigation of the advantages of representing observations as vectors of categorical values within the context of reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Directions of Curvature as an Explanation for Loss of Plasticity

no code implementations • 30 Nov 2023 • Alex Lewandowski, Haruto Tanaka, Dale Schuurmans, Marlos C. Machado

Loss of plasticity is a phenomenon in which neural networks lose their ability to learn from new experience.

Continual Learning

Recurrent Linear Transformers

1 code implementation • 24 Oct 2023 • Subhojeet Pramanik, Esraa Elelimy, Marlos C. Machado, Adam White

In this paper we introduce recurrent alternatives to the transformer self-attention mechanism that offer a context-independent inference cost, leverage long-range dependencies effectively, and perform well in practice.

Proper Laplacian Representation Learning

2 code implementations • 16 Oct 2023 • Diego Gomez, Michael Bowling, Marlos C. Machado

The ability to learn good representations of states is essential for solving large reinforcement learning problems, where exploration, generalization, and transfer are particularly challenging.

Representation Learning

Loss of Plasticity in Continual Deep Reinforcement Learning

no code implementations • 13 Mar 2023 • Zaheer Abbas, Rosie Zhao, Joseph Modayil, Adam White, Marlos C. Machado

The ability to learn continually is essential in a complex and changing world.

Atari Games Continual Learning +2

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

1 code implementation • 26 Jan 2023 • Brett Daley, Martha White, Christopher Amato, Marlos C. Machado

Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging.

reinforcement-learning Reinforcement Learning (RL)

Deep Laplacian-based Options for Temporally-Extended Exploration

1 code implementation • 26 Jan 2023 • Martin Klissarov, Marlos C. Machado

In this paper we address these limitations and show how recent results for directly approximating the eigenfunctions of the Laplacian can be leveraged to truly scale up options-based exploration.

Reinforcement Learning (RL)

Agent-State Construction with Auxiliary Inputs

1 code implementation • 15 Nov 2022 • Ruo Yu Tao, Adam White, Marlos C. Machado

Finally, we show that this approach is complementary to state-of-the-art methods such as recurrent neural networks and truncated back-propagation through time, and acts as a heuristic that facilitates longer temporal credit assignment, leading to better performance.

Decision Making reinforcement-learning +1

Investigating the Properties of Neural Network Representations in Reinforcement Learning

no code implementations • 30 Mar 2022 • Han Wang, Erfan Miahi, Martha White, Marlos C. Machado, Zaheer Abbas, Raksha Kumaraswamy, Vincent Liu, Adam White

In this paper we investigate the properties of representations learned by deep reinforcement learning systems.

Q-Learning reinforcement-learning +2

Temporal Abstractions-Augmented Temporally Contrastive Learning: An Alternative to the Laplacian in RL

no code implementations • 21 Mar 2022 • Akram Erraqabi, Marlos C. Machado, Mingde Zhao, Sainbayar Sukhbaatar, Alessandro Lazaric, Ludovic Denoyer, Yoshua Bengio

In reinforcement learning, the graph Laplacian has proved to be a valuable tool in the task-agnostic setting, with applications ranging from skill discovery to reward shaping.

Continuous Control Contrastive Learning +1

Reward-Respecting Subtasks for Model-Based Reinforcement Learning

no code implementations • 7 Feb 2022 • Richard S. Sutton, Marlos C. Machado, G. Zacharias Holland, David Szepesvari, Finbarr Timbers, Brian Tanner, Adam White

Each subtask is solved to produce an option, and then a model of the option is learned and made available to the planning process.

Model-based Reinforcement Learning reinforcement-learning +1

Temporal Abstraction in Reinforcement Learning with the Successor Representation

no code implementations • 12 Oct 2021 • Marlos C. Machado, Andre Barreto, Doina Precup, Michael Bowling

In this paper, we argue that the successor representation (SR), which encodes states based on the pattern of state visitation that follows them, can be seen as a natural substrate for the discovery and use of temporal abstractions.

reinforcement-learning Reinforcement Learning (RL)

On Bonus-Based Exploration Methods in the Arcade Learning Environment

no code implementations • 22 Sep 2021 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge

A general class of surrogate functions for stable and efficient reinforcement learning

1 code implementation • 12 Aug 2021 • Sharan Vaswani, Olivier Bachem, Simone Totaro, Robert Mueller, Shivam Garg, Matthieu Geist, Marlos C. Machado, Pablo Samuel Castro, Nicolas Le Roux

Common policy gradient methods rely on the maximization of a sequence of surrogate functions.

Policy Gradient Methods reinforcement-learning +1

Exploration-Driven Representation Learning in Reinforcement Learning

no code implementations • ICML Workshop URL 2021 • Akram Erraqabi, Mingde Zhao, Marlos C. Machado, Yoshua Bengio, Sainbayar Sukhbaatar, Ludovic Denoyer, Alessandro Lazaric

In this work, we introduce a method that explicitly couples representation learning with exploration when the agent is not provided with a uniform prior over the state space.

reinforcement-learning Reinforcement Learning (RL) +1

Contrastive Behavioral Similarity Embeddings for Generalization in Reinforcement Learning

1 code implementation • ICLR 2021 • Rishabh Agarwal, Marlos C. Machado, Pablo Samuel Castro, Marc G. Bellemare

Specifically, we introduce a theoretically motivated policy similarity metric (PSM) for measuring behavioral similarity between states.

reinforcement-learning Reinforcement Learning (RL) +1

Beyond variance reduction: Understanding the true impact of baselines on policy optimization

no code implementations • 31 Aug 2020 • Wesley Chung, Valentin Thomas, Marlos C. Machado, Nicolas Le Roux

Traditionally, stochastic optimization theory predicts that learning dynamics are governed by the curvature of the loss function and the noise of the gradient estimates.

Reinforcement Learning (RL) Stochastic Optimization

An operator view of policy gradient methods

no code implementations • NeurIPS 2020 • Dibya Ghosh, Marlos C. Machado, Nicolas Le Roux

We cast policy gradient methods as the repeated application of two operators: a policy improvement operator $\mathcal{I}$, which maps any policy $\pi$ to a better one $\mathcal{I}\pi$, and a projection operator $\mathcal{P}$, which finds the best approximation of $\mathcal{I}\pi$ in the set of realizable policies.

Policy Gradient Methods

Exploration in Reinforcement Learning with Deep Covering Options

no code implementations • ICLR 2020 • Yuu Jinnai, Jee Won Park, Marlos C. Machado, George Konidaris

While many option discovery methods have been proposed to accelerate exploration in reinforcement learning, they are often heuristic.

reinforcement-learning Reinforcement Learning (RL)

On Bonus Based Exploration Methods In The Arcade Learning Environment

no code implementations • ICLR 2020 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).

Montezuma's Revenge

Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment

no code implementations • 6 Aug 2019 • Adrien Ali Taïga, William Fedus, Marlos C. Machado, Aaron Courville, Marc G. Bellemare

This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).

Benchmarking Montezuma's Revenge

Generalization and Regularization in DQN

1 code implementation • 29 Sep 2018 • Jesse Farebrother, Marlos C. Machado, Michael Bowling

Deep reinforcement learning algorithms have shown an impressive ability to learn complex control policies in high-dimensional tasks.

Atari Games Benchmarking +2

Count-Based Exploration with the Successor Representation

2 code implementations • ICLR 2019 • Marlos C. Machado, Marc G. Bellemare, Michael Bowling

In this paper we introduce a simple approach for exploration in reinforcement learning (RL) that allows us to develop theoretically justified algorithms in the tabular case but that is also extendable to settings where function approximation is required.

Ranked #16 on Atari Games on Atari 2600 Venture

Atari Games Efficient Exploration +1

Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation

no code implementations • 23 Mar 2018 • Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski

As a primary contribution of this work, we show that using SR-based predictions can improve sample efficiency and learning speed in a continual learning setting where new predictions are incrementally added and learned over time.

Continual Learning Reinforcement Learning (RL)

The Eigenoption-Critic Framework

no code implementations • 11 Dec 2017 • Miao Liu, Marlos C. Machado, Gerald Tesauro, Murray Campbell

Eigenoptions (EOs) have been recently introduced as a promising idea for generating a diverse set of options through the graph Laplacian, having been shown to allow efficient exploration.

Efficient Exploration Hierarchical Reinforcement Learning +1

Eigenoption Discovery through the Deep Successor Representation

1 code implementation • ICLR 2018 • Marlos C. Machado, Clemens Rosenbaum, Xiaoxiao Guo, Miao Liu, Gerald Tesauro, Murray Campbell

Options in reinforcement learning allow agents to hierarchically decompose a task into subtasks, having the potential to speed up learning and planning.

Atari Games reinforcement-learning +2

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

7 code implementations • 18 Sep 2017 • Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games.

Atari Games

A Laplacian Framework for Option Discovery in Reinforcement Learning

1 code implementation • ICML 2017 • Marlos C. Machado, Marc G. Bellemare, Michael Bowling

Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL).

Atari Games reinforcement-learning +2

Introspective Agents: Confidence Measures for General Value Functions

no code implementations • 17 Jun 2016 • Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski

Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions.

Position

Learning Purposeful Behaviour in the Absence of Rewards

no code implementations • 25 May 2016 • Marlos C. Machado, Michael Bowling

In the reinforcement learning framework, goals are encoded as reward functions that guide agent behaviour, and the sum of observed rewards provides a notion of progress.

True Online Temporal-Difference Learning

1 code implementation • 13 Dec 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton

Our results suggest that the true online methods indeed dominate the regular methods.

Atari Games

State of the Art Control of Atari Games Using Shallow Reinforcement Learning

1 code implementation • 4 Dec 2015 • Yitao Liang, Marlos C. Machado, Erik Talvitie, Michael Bowling

The recently introduced Deep Q-Networks (DQN) algorithm has gained attention as one of the first successful combinations of deep neural networks and reinforcement learning.

Atari Games reinforcement-learning +1

Domain-Independent Optimistic Initialization for Reinforcement Learning

no code implementations • 16 Oct 2014 • Marlos C. Machado, Sriram Srinivasan, Michael Bowling

In Reinforcement Learning (RL), it is common to use optimistic initialization of value functions to encourage exploration.

reinforcement-learning Reinforcement Learning (RL)

A Methodology for Player Modeling based on Machine Learning

no code implementations • 13 Dec 2013 • Marlos C. Machado

We also presented a generic approach to deal with player modeling using ML, and we instantiated this approach to model players' preferences in the game Civilization IV.

BIG-bench Machine Learning Binary Classification
