Search Results for author: Andrew Patterson

Found 17 papers, 5 papers with code

Empirical Design in Reinforcement Learning

no code implementations 3 Apr 2023 Andrew Patterson, Samuel Neumann, Martha White, Adam White

The objective of this document is to provide answers on how we can use our unprecedented compute to do good science in reinforcement learning, as well as stay alert to potential pitfalls in our empirical design.

reinforcement-learning

Robust Losses for Learning Value Functions

no code implementations 17 May 2022 Andrew Patterson, Victor Liao, Martha White

We start from a formalization of robust losses, then derive sound gradient-based approaches to minimize these losses in both the online off-policy prediction and control settings.
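A common example of a robust loss is the Huber function, which is quadratic near zero and linear in the tails. As a hedged illustration only (the specific losses derived in the paper may differ), applying such a loss to a TD-style error looks like:

```python
import numpy as np

def huber(delta, kappa=1.0):
    """Huber loss: quadratic for |delta| <= kappa, linear beyond,
    so large (outlier) errors are penalized less aggressively."""
    a = np.abs(delta)
    return np.where(a <= kappa, 0.5 * delta**2, kappa * (a - 0.5 * kappa))

# TD errors, including one outlier
deltas = np.array([0.1, -0.3, 5.0])
print(huber(deltas))       # small errors penalized quadratically, outlier linearly
print(0.5 * deltas**2)     # squared loss for comparison: the outlier dominates
```

The linear tails are what make the loss robust: a single large Bellman error no longer dominates the gradient.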

A Temporal-Difference Approach to Policy Gradient Estimation

1 code implementation 4 Feb 2022 Samuele Tosatto, Andrew Patterson, Martha White, A. Rupam Mahmood

The policy gradient theorem (Sutton et al., 2000) prescribes the usage of a cumulative discounted state distribution under the target policy to approximate the gradient.
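For reference, the policy gradient theorem expresses the gradient through exactly this discounted state distribution $d^{\pi}$:

```latex
\nabla_\theta J(\theta)
  \propto \sum_{s} d^{\pi}(s) \sum_{a} \nabla_\theta \pi_\theta(a \mid s)\, q^{\pi}(s, a),
\qquad
d^{\pi}(s) = \sum_{t=0}^{\infty} \gamma^{t} \Pr(S_t = s \mid \pi).
```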

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning

no code implementations 28 Apr 2021 Andrew Patterson, Adam White, Martha White

Many algorithms have been developed for off-policy value estimation based on the linear mean squared projected Bellman error (MSPBE) and are sound under linear function approximation.

Reinforcement Learning (RL)
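For a linear value estimate $v_w = X w$ with feature matrix $X$, state weighting $D$, and Bellman operator $T^{\pi}$, the MSPBE referenced above can be written as:

```latex
\mathrm{MSPBE}(w) = \lVert \Pi\, T^{\pi} v_w - v_w \rVert_{D}^{2},
\qquad
\Pi = X \left( X^{\top} D X \right)^{-1} X^{\top} D,
```

where $\Pi$ projects the Bellman target back onto the span of the features.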

Contraction $\mathcal{L}_1$-Adaptive Control using Gaussian Processes

no code implementations 8 Sep 2020 Aditya Gahlawat, Arun Lakshmanan, Lin Song, Andrew Patterson, Zhuohuan Wu, Naira Hovakimyan, Evangelos Theodorou

We present $\mathcal{CL}_1$-$\mathcal{GP}$, a control framework that enables safe simultaneous learning and control for systems subject to uncertainties.

Gaussian Processes

Gradient Temporal-Difference Learning with Regularized Corrections

1 code implementation ICML 2020 Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White

It is still common to use Q-learning and temporal difference (TD) learning, even though they have divergence issues and sound Gradient TD alternatives exist, because divergence seems rare and they typically perform well.

Q-Learning
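As a rough sketch of the general idea (linear features; illustrative step sizes, and the exact update in the paper may differ in details), a gradient-TD-style update with an L2-regularized correction vector h looks like:

```python
import numpy as np

def tdrc_update(w, h, x, x_next, r, gamma=0.99, alpha=0.1, beta=1.0):
    """One gradient-TD-style update with a regularized correction term.
    w: value weights, h: secondary (correction) weights, x/x_next: features."""
    delta = r + gamma * x_next @ w - x @ w                   # TD error
    w = w + alpha * (delta * x - gamma * (h @ x) * x_next)   # TDC-style main update
    h = h + alpha * ((delta - h @ x) * x - beta * h)         # correction with L2 regularization
    return w, h

# tiny two-state chain: state 0 -> state 1 -> terminal, rewards 0 then 1
w, h = np.zeros(2), np.zeros(2)
x0, x1, x_term = np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.zeros(2)
for _ in range(500):
    w, h = tdrc_update(w, h, x0, x1, r=0.0)
    w, h = tdrc_update(w, h, x1, x_term, r=1.0)
print(w)  # value of state 1 should approach 1, state 0 approach gamma * 1
```

The regularization on h is what distinguishes this from plain TDC: it keeps the correction weights small, which is where the stability benefit comes from.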

L1-GP: L1 Adaptive Control with Bayesian Learning

no code implementations L4DC 2020 Aditya Gahlawat, Pan Zhao, Andrew Patterson, Naira Hovakimyan, Evangelos Theodorou

We present L1-GP, an architecture based on L1 adaptive control and Gaussian Process Regression (GPR) for safe simultaneous control and learning.

Gaussian Process Regression

Learning Probabilistic Intersection Traffic Models for Trajectory Prediction

no code implementations 5 Feb 2020 Andrew Patterson, Aditya Gahlawat, Naira Hovakimyan

The safety of these agents is dependent on their ability to predict collisions with other vehicles' future trajectories for replanning and collision avoidance.

Collision Avoidance · Object Recognition · +2

Learning Macroscopic Brain Connectomes via Group-Sparse Factorization

1 code implementation NeurIPS 2019 Farzane Aminmansour, Andrew Patterson, Lei Le, Yisu Peng, Daniel Mitchell, Franco Pestilli, Cesar F. Caiafa, Russell Greiner, Martha White

We develop an efficient optimization strategy for this extremely high-dimensional sparse problem, by reducing the number of parameters using a greedy algorithm designed specifically for the problem.

Intent-Aware Probabilistic Trajectory Estimation for Collision Prediction with Uncertainty Quantification

no code implementations 4 Apr 2019 Andrew Patterson, Arun Lakshmanan, Naira Hovakimyan

We show that the uncertainty region for obstacle positions can be expressed in terms of a combination of polynomials generated with Gaussian process regression.

Uncertainty Quantification

Proximity Queries for Absolutely Continuous Parametric Curves

3 code implementations 13 Feb 2019 Arun Lakshmanan, Andrew Patterson, Venanzio Cichella, Naira Hovakimyan

In motion planning problems for autonomous robots, such as self-driving cars, the robot must ensure that its planned path is not in close proximity to obstacles in the environment.

Robotics · Computational Geometry · Graphics

Supervised autoencoders: Improving generalization performance with unsupervised regularizers

1 code implementation NeurIPS 2018 Lei Le, Andrew Patterson, Martha White

A common strategy to improve generalization has been through the use of regularizers, typically as a norm constraining the parameters.
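In essence, a supervised autoencoder augments the supervised loss with a reconstruction loss as an unsupervised regularizer, sharing one learned representation. A minimal NumPy sketch (shapes, names, and the tanh encoder are illustrative, not the paper's exact architecture):

```python
import numpy as np

def sae_loss(x, y, W_enc, W_pred, W_dec, weight=1.0):
    """Supervised autoencoder loss: supervised error on y plus a
    reconstruction error on x, computed from a shared representation."""
    h = np.tanh(x @ W_enc)            # shared hidden representation
    y_hat = h @ W_pred                # supervised head
    x_hat = h @ W_dec                 # reconstruction head
    supervised = np.mean((y_hat - y) ** 2)
    reconstruction = np.mean((x_hat - x) ** 2)
    return supervised + weight * reconstruction

rng = np.random.default_rng(0)
x = rng.normal(size=(32, 8))          # inputs
y = rng.normal(size=(32, 1))          # targets
W_enc = rng.normal(scale=0.1, size=(8, 4))
W_pred = rng.normal(scale=0.1, size=(4, 1))
W_dec = rng.normal(scale=0.1, size=(4, 8))
print(sae_loss(x, y, W_enc, W_pred, W_dec))
```

Setting `weight=0` recovers the plain supervised loss; the reconstruction term acts as the regularizer.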

Online Off-policy Prediction

no code implementations 6 Nov 2018 Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White

The ability to learn behavior-contingent predictions online and off-policy has long been advocated as a key capability of predictive-knowledge learning systems, but has remained an open algorithmic challenge for decades.

General Value Function Networks

no code implementations 18 Jul 2018 Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation.

Continuous Control · Decision Making
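The state-update recurrence described above, s_t = f(s_{t-1}, o_t), can be sketched as a minimal RNN cell (illustrative only, not the paper's architecture):

```python
import numpy as np

def rnn_state_update(s_prev, obs, W_s, W_o, b):
    """Update the internal state from the previous internal state
    and the most recent observation."""
    return np.tanh(s_prev @ W_s + obs @ W_o + b)

rng = np.random.default_rng(1)
state_dim, obs_dim = 4, 3
W_s = rng.normal(scale=0.1, size=(state_dim, state_dim))
W_o = rng.normal(scale=0.1, size=(obs_dim, state_dim))
b = np.zeros(state_dim)

s = np.zeros(state_dim)
for obs in rng.normal(size=(10, obs_dim)):   # a stream of observations
    s = rnn_state_update(s, obs, W_s, W_o, b)
print(s.shape)  # the state keeps a fixed dimension regardless of history length
```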

Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains

no code implementations 12 Jun 2018 Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White

We show that a model, as opposed to a replay buffer, is particularly useful for specifying which states to sample from during planning, such as predecessor states that propagate information in reverse from a state more quickly.
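As a toy illustration of why a model helps here (a tabular sketch, not the paper's algorithm): a learned model can be inverted to find predecessor states of a just-updated state, so planning propagates value information backwards. A replay buffer alone cannot answer "which states lead here?".

```python
from collections import defaultdict

# deterministic tabular model: (state, action) -> (reward, next_state)
model = {}
predecessors = defaultdict(set)   # next_state -> set of (state, action) leading to it

def record_transition(s, a, r, s_next):
    model[(s, a)] = (r, s_next)
    predecessors[s_next].add((s, a))

def plan_from_predecessors(q, s_updated, alpha=0.5, gamma=0.9):
    """Sweep backwards: update Q at all recorded predecessors of s_updated."""
    for (s, a) in predecessors[s_updated]:
        r, s_next = model[(s, a)]
        target = r + gamma * max(q[(s_next, b)] for b in (0, 1))
        q[(s, a)] += alpha * (target - q[(s, a)])

q = defaultdict(float)
record_transition(0, 0, 0.0, 1)   # state 0 leads to state 1
record_transition(1, 0, 1.0, 2)   # state 1 leads to a rewarding state 2
q[(1, 0)] = 0.5                   # suppose state 1's value was just updated
plan_from_predecessors(q, 1)      # value information flows back to state 0
print(q[(0, 0)])                  # 0.5 * (0.0 + 0.9 * 0.5) = 0.225
```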

Discovery of Predictive Representations With a Network of General Value Functions

no code implementations ICLR 2018 Matthew Schlegel, Andrew Patterson, Adam White, Martha White

We investigate a framework for discovery: curating a large collection of predictions, which are used to construct the agent's representation of the world.

Decision Making
