no code implementations • 2 Apr 2024 • Faraz Lotfi, Farnoosh Faraji, Nikhil Kakodkar, Travis Manderson, David Meger, Gregory Dudek
This paper explores leveraging large language models for map-free off-road navigation using generative AI, reducing the need for traditional data collection and annotation.
no code implementations • 30 Nov 2023 • Jean-François Tremblay, David Meger, Francois Hogan, Gregory Dudek
These robots will need to sense these properties through interaction prior to performing downstream tasks with the objects.
no code implementations • 15 Nov 2023 • Wei-Di Chang, Francois Hogan, David Meger, Gregory Dudek
In this paper we leverage self-supervised vision transformer models and their emergent semantic abilities to improve the generalization abilities of imitation learning policies.
no code implementations • 2 Oct 2023 • Wei-Di Chang, Scott Fujimoto, David Meger, Gregory Dudek
Imitation Learning from Observation (ILfO) is a setting in which a learner tries to imitate the behavior of an expert, using only observational data and without the direct guidance of demonstrated actions.
no code implementations • 8 Sep 2023 • Zhizun Wang, David Meger
In this paper, we propose a novel model-based multi-agent reinforcement learning approach, the Value Decomposition Framework with Disentangled World Model, which addresses the challenge of multiple agents pursuing a common goal in a shared environment while reducing sample complexity.
no code implementations • 25 Aug 2023 • Lucas Berry, David Meger
This work introduces a novel, efficient approach to epistemic uncertainty estimation for ensemble regression models using pairwise-distance estimators (PaiDEs).
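To make the idea concrete, here is a minimal sketch of a pairwise-distance entropy estimator for an equal-weight mixture of 1D Gaussians, in the spirit of PaiDEs: the mixture entropy is approximated from component entropies and pairwise KL divergences (a Kolchinsky-Tracey-style bound). The function name, the 1D setting, and the equal-variance assumption are illustrative, not the paper's implementation.

```python
import numpy as np

def pairwise_entropy_estimate(means, sigma):
    """Approximate the entropy of an equal-weight mixture of 1D Gaussians
    with common std `sigma` using only pairwise component distances."""
    k = len(means)
    w = np.full(k, 1.0 / k)
    # Differential entropy of each Gaussian component (all identical here).
    h = 0.5 * np.log(2 * np.pi * np.e * sigma ** 2)
    # KL divergence between equal-variance Gaussians: (mu_i - mu_j)^2 / (2 sigma^2).
    mu = np.asarray(means, dtype=float)
    D = (mu[:, None] - mu[None, :]) ** 2 / (2 * sigma ** 2)
    # Pairwise-distance estimate: component entropy plus a mixing term
    # built only from the k x k distance matrix, never from samples.
    return h - np.sum(w * np.log(np.sum(w * np.exp(-D), axis=1)))
```

When all components coincide, the estimate reduces to the single-Gaussian entropy; when they are far apart, it approaches that entropy plus log k, matching the known behavior of mixture entropy.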
2 code implementations • NeurIPS 2023 • Scott Fujimoto, Wei-Di Chang, Edward J. Smith, Shixiang Shane Gu, Doina Precup, David Meger
In the field of reinforcement learning (RL), representation learning is a proven tool for complex image-based tasks, but is often overlooked for environments with low-level states, such as physical control problems.
2 code implementations • 9 May 2023 • Prakash Panangaden, Sahand Rezaei-Shoshtari, Rosie Zhao, David Meger, Doina Precup
Our policy gradient results allow for leveraging approximate symmetries of the environment for policy optimization.
1 code implementation • 2 Feb 2023 • Lucas Berry, David Meger
In this work, we demonstrate how to reliably estimate epistemic uncertainty while maintaining the flexibility needed to capture complicated aleatoric distributions.
no code implementations • 28 Nov 2022 • Sahand Rezaei-Shoshtari, Charlotte Morissette, Francois Robert Hogan, Gregory Dudek, David Meger
In this paper, hypernetworks are trained to generate behaviors across a range of unseen task conditions, via a novel TD-based training objective and data from a set of near-optimal RL solutions for training tasks.
no code implementations • 14 Nov 2022 • Amir Rasouli, Randy Goebel, Matthew E. Taylor, Iuliia Kotseruba, Soheil Alizadeh, Tianpei Yang, Montgomery Alban, Florian Shkurti, Yuzheng Zhuang, Adam Scibior, Kasra Rezaee, Animesh Garg, David Meger, Jun Luo, Liam Paull, Weinan Zhang, Xinyu Wang, Xi Chen
The proposed competition supports methodologically diverse solutions, such as reinforcement learning (RL) and offline learning methods, trained on a combination of naturalistic AD data and the open-source simulation platform SMARTS.
1 code implementation • 3 Oct 2022 • Edward J. Smith, Michal Drozdzal, Derek Nowrouzezahrai, David Meger, Adriana Romero-Soriano
We evaluate our proposed approach on the ABC dataset and the in-the-wild CO3D dataset, and show that: (1) we are able to obtain high-quality, state-of-the-art occupancy reconstructions; (2) our perspective-conditioned uncertainty definition is effective in driving improvements in next-best-view selection and outperforms strong baseline approaches; and (3) we can further improve shape understanding by performing a gradient-based search on the view selection candidates.
no code implementations • 1 Oct 2022 • Fengdi Che, Xiru Zhu, Doina Precup, David Meger, Gregory Dudek
Guided exploration with expert demonstrations improves data efficiency for reinforcement learning, but current algorithms often overuse expert information.
1 code implementation • 15 Sep 2022 • Sahand Rezaei-Shoshtari, Rosie Zhao, Prakash Panangaden, David Meger, Doina Precup
Abstraction has been widely studied as a way to improve the efficiency and generalization of reinforcement learning algorithms.
no code implementations • 24 May 2022 • Harley Wiltzer, David Meger, Marc G. Bellemare
We demonstrate the effectiveness of such an algorithm in a synthetic control problem.
no code implementations • 19 May 2022 • Wei-Di Chang, Juan Camilo Gamboa Higuera, Scott Fujimoto, David Meger, Gregory Dudek
We present an algorithm for Inverse Reinforcement Learning (IRL) from expert state observations only.
no code implementations • 28 Jan 2022 • Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu
In this work, we study the use of the Bellman equation as a surrogate objective for value prediction accuracy.
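As a toy illustration of why Bellman error can be a weak proxy for value accuracy (a two-state tabular example under standard definitions, not the paper's analysis): adding a constant offset to the true value function leaves only a small Bellman residual, scaled by (1 - gamma), while the value error remains the full offset.

```python
import numpy as np

def bellman_residual(V, r, P, gamma):
    # One-step Bellman residual: r + gamma * P V - V.
    return r + gamma * P @ V - V

gamma = 0.9
# Deterministic two-state chain that alternates between states.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
r = np.array([1.0, 0.0])
# True value function solves (I - gamma * P) V = r.
V_true = np.linalg.solve(np.eye(2) - gamma * P, r)
# A constant offset of c = 5 gives a residual of -(1 - gamma) * c = -0.5
# per state, while the value error is the full c = 5.
V_hat = V_true + 5.0
res = bellman_residual(V_hat, r, P, gamma)
```

The gap between the two quantities grows like 1 / (1 - gamma), which is one standard way to see why minimizing Bellman error need not yield accurate values.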
no code implementations • 9 Dec 2021 • Stefan Wapnick, Travis Manderson, David Meger, Gregory Dudek
We present a reward-predictive, model-based deep learning method featuring trajectory-constrained visual attention for local planning in visual navigation tasks.
no code implementations • 29 Sep 2021 • Di Wu, Tianyu Li, David Meger, Michael Jenkin, Xue Liu, Gregory Dudek
Unfortunately, most online reinforcement learning algorithms require a large number of interactions with the environment to learn a reliable control policy.
no code implementations • 29 Sep 2021 • Scott Fujimoto, David Meger, Doina Precup, Ofir Nachum, Shixiang Shane Gu
In this work, we analyze the effectiveness of the Bellman equation as a proxy objective for value prediction accuracy in off-policy evaluation.
no code implementations • 29 Sep 2021 • Melissa Mozifian, Dieter Fox, David Meger, Fabio Ramos, Animesh Garg
In this paper, we consider the problem of continuous control for various robot manipulation tasks with an explicit representation that promotes skill reuse while learning multiple tasks, related through the reward function.
2 code implementations • NeurIPS 2021 • Edward J. Smith, David Meger, Luis Pineda, Roberto Calandra, Jitendra Malik, Adriana Romero, Michal Drozdzal
In this paper, we focus on this problem and introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile signals; and 3) a set of data-driven solutions with either tactile or visuotactile priors to guide the shape exploration.
1 code implementation • 12 Jun 2021 • Scott Fujimoto, David Meger, Doina Precup
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
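The successor representation mentioned here has a simple closed form in the tabular case, which a short sketch can illustrate (a toy two-state chain; the paper's contribution is the deep, density-ratio connection, which this does not reproduce):

```python
import numpy as np

def successor_representation(P_pi, gamma=0.9):
    """Tabular successor representation of a policy: expected discounted
    visitation counts, (I - gamma * P_pi)^{-1}."""
    n = P_pi.shape[0]
    return np.linalg.inv(np.eye(n) - gamma * P_pi)

# Two-state chain that always moves to the other state.
P = np.array([[0.0, 1.0],
              [1.0, 0.0]])
psi = successor_representation(P, gamma=0.5)
```

Each row of `psi` sums to 1 / (1 - gamma), the total discounted mass, and normalizing such visitation quantities is what connects the successor representation to the state-density ratios used in marginalized importance sampling.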
1 code implementation • 12 Jan 2021 • Sahand Rezaei-Shoshtari, Francois Robert Hogan, Michael Jenkin, David Meger, Gregory Dudek
Predicting the future interaction of objects when they come into contact with their environment is key for autonomous agents to take intelligent and anticipatory actions.
no code implementations • 1 Jan 2021 • Scott Fujimoto, David Meger, Doina Precup
We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
1 code implementation • 3 Dec 2020 • Melissa Mozifian, Amy Zhang, Joelle Pineau, David Meger
The goal of this work is to explain the recent success of domain randomization and data augmentation for the sim2real setting.
1 code implementation • 22 Jul 2020 • Sahand Rezaei-Shoshtari, David Meger, Inna Sharf
This work explores the use of a latent space to capture a lower-dimensional representation of a complex dynamics model.
1 code implementation • NeurIPS 2020 • Scott Fujimoto, David Meger, Doina Precup
Prioritized Experience Replay (PER) is a deep reinforcement learning technique in which agents learn from transitions sampled with non-uniform probability proportionate to their temporal-difference error.
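The sampling scheme described in this abstract can be sketched as follows (a hypothetical minimal buffer; `alpha`, the epsilon floor, and the importance weights follow the standard PER recipe rather than this paper's variant):

```python
import numpy as np

class PrioritizedBuffer:
    """Minimal prioritized replay: sample transitions with probability
    proportional to a power of their absolute TD error."""

    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha   # how strongly TD error skews sampling
        self.eps = eps       # keeps every priority strictly positive
        self.data, self.priorities = [], []

    def add(self, transition, td_error):
        if len(self.data) >= self.capacity:
            self.data.pop(0)
            self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size, rng=np.random):
        p = np.asarray(self.priorities)
        p = p / p.sum()  # probability proportional to priority
        idx = rng.choice(len(self.data), size=batch_size, p=p)
        # Importance weights correct the bias from non-uniform sampling.
        weights = (len(self.data) * p[idx]) ** -1.0
        weights /= weights.max()
        return idx, [self.data[i] for i in idx], weights
```

High-error transitions are replayed more often, and the returned weights rescale their gradient contributions so the update remains (approximately) unbiased.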
1 code implementation • NeurIPS 2020 • Edward J. Smith, Roberto Calandra, Adriana Romero, Georgia Gkioxari, David Meger, Jitendra Malik, Michal Drozdzal
When a toddler is presented with a new toy, their instinctual behaviour is to pick it up and inspect it with their hand and eyes in tandem, clearly searching over its surface to properly understand what they are playing with.
no code implementations • 9 Apr 2020 • Travis Manderson, Stefan Wapnick, David Meger, Gregory Dudek
We present a method for learning to drive on smooth terrain while simultaneously avoiding collisions in challenging off-road and unstructured outdoor environments using only visual inputs.
no code implementations • 2 Dec 2019 • Xiru Zhu, Fengdi Che, Tianzi Yang, Tzuyang Yu, David Meger, Gregory Dudek
This is because the task of evaluating the quality of a generated image differs from deciding if an image is real or fake.
1 code implementation • 23 Oct 2019 • Sanjay Thakur, Herke van Hoof, Gunshi Gupta, David Meger
PAC-Bayes is a generalized framework that is more resistant to overfitting and yields performance bounds that hold with arbitrarily high probability, even under unjustified extrapolation.
no code implementations • 14 Oct 2019 • Caleb Hoyne, S. Karthik Mukkavilli, David Meger
Reanalysis datasets, which combine numerical physics models with limited observations to generate synthesized estimates of variables in an Earth system, are prone to biases against ground truth.
no code implementations • 5 Oct 2019 • Sahand Rezaei-Shoshtari, David Meger, Inna Sharf
Motivated by the recursive Newton-Euler formulation, we propose a novel cascaded Gaussian process learning framework for the inverse dynamics of robot manipulators.
no code implementations • 2 Jun 2019 • Melissa Mozifian, Juan Camilo Gamboa Higuera, David Meger, Gregory Dudek
We explore the use of gradient-based search methods to learn a domain randomization distribution with the following properties: 1) the trained policy should be successful in environments sampled from the domain randomization distribution; and 2) the domain randomization distribution should be wide enough that experience similar to the target robot system is observed during training, while remaining practical for training finite-capacity models.
no code implementations • 18 Apr 2019 • Yi Tian Xu, Yaqiao Li, David Meger
Inspired by ideas in cognitive science, we propose a novel and general approach to solve human motion understanding via pattern completion on a learned latent representation space.
1 code implementation • 13 Mar 2019 • Sanjay Thakur, Herke van Hoof, Juan Camilo Gamboa Higuera, Doina Precup, David Meger
Learned controllers such as neural networks typically do not have a notion of uncertainty that allows them to diagnose an offset between training and testing conditions and potentially intervene.
1 code implementation • 31 Jan 2019 • Edward J. Smith, Scott Fujimoto, Adriana Romero, David Meger
Mesh models are a promising approach for encoding the structure of 3D objects.
Ranked #1 on 3D Object Reconstruction on Data3D-R2N2 (Avg F1 metric)
10 code implementations • 7 Dec 2018 • Scott Fujimoto, David Meger, Doina Precup
Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection.
no code implementations • 27 Sep 2018 • Scott Fujimoto, David Meger, Doina Precup
This work examines batch reinforcement learning: the task of maximally exploiting a given batch of off-policy data, without further data collection.
3 code implementations • 6 Mar 2018 • Juan Camilo Gamboa Higuera, David Meger, Gregory Dudek
Finally, we assess the performance of the algorithm for learning motor controllers for a six legged autonomous underwater vehicle.
3 code implementations • NeurIPS 2018 • Edward Smith, Scott Fujimoto, David Meger
We consider the problem of scaling deep generative shape models to high-resolution.
Ranked #2 on 3D Object Reconstruction on Data3D-R2N2 (Avg F1 metric)
67 code implementations • ICML 2018 • Scott Fujimoto, Herke van Hoof, David Meger
In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.
Ranked #2 on Continuous Control on Lunar Lander (OpenAI Gym)
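The clipped double-Q target this work introduces can be sketched in a few lines (the critic and policy callables are hypothetical placeholders; `noise_std` and `noise_clip` are the commonly used defaults, assumed here rather than taken from the text):

```python
import numpy as np

def td3_target(q1, q2, policy, next_state, reward, done,
               gamma=0.99, noise_std=0.2, noise_clip=0.5, rng=np.random):
    """Compute the TD3 critic target: target-policy smoothing plus the
    minimum over two critics to curb value overestimation."""
    noise = np.clip(rng.normal(0.0, noise_std), -noise_clip, noise_clip)
    next_action = policy(next_state) + noise  # target policy smoothing
    # Clipped double-Q: take the smaller of the two critic estimates.
    q_min = min(q1(next_state, next_action), q2(next_state, next_action))
    return reward + gamma * (1.0 - done) * q_min
```

Taking the minimum of two independently trained critics biases the target downward, which is exactly the countermeasure to the overestimation described in the abstract.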
1 code implementation • 6 Dec 2017 • Peter Henderson, Thang Doan, Riashat Islam, David Meger
Policy gradient methods have had great success in solving continuous control tasks, yet the stochastic nature of such problems makes deterministic value estimation difficult.
1 code implementation • 21 Sep 2017 • Peter Henderson, Matthew Vertescher, David Meger, Mark Coates
To alleviate this problem, we use a meta-learning process, cost adaptation, which generates the optimization objective for D-RHC to solve based on a set of human-generated priors (cost and constraint functions) and an auxiliary heuristic.
1 code implementation • 20 Sep 2017 • Peter Henderson, Wei-Di Chang, Pierre-Luc Bacon, David Meger, Joelle Pineau, Doina Precup
Inverse reinforcement learning offers a useful paradigm to learn the underlying reward function directly from expert demonstrations.
4 code implementations • 19 Sep 2017 • Peter Henderson, Riashat Islam, Philip Bachman, Joelle Pineau, Doina Precup, David Meger
In recent years, significant progress has been made in solving challenging problems across various domains using deep reinforcement learning (RL).
1 code implementation • 14 Aug 2017 • Peter Henderson, Wei-Di Chang, Florian Shkurti, Johanna Hansen, David Meger, Gregory Dudek
As demand drives systems to generalize to various domains and problems, the study of multitask, transfer and lifelong learning has become an increasingly important pursuit.
3 code implementations • 29 Jul 2017 • Edward Smith, David Meger
This paper describes a new approach for training generative adversarial networks (GANs) to understand the detailed 3D shape of objects.