Search Results for author: Andrey Kolobov

Found 19 papers, 9 papers with code

PRISE: Learning Temporal Action Abstractions as a Sequence Compression Problem

1 code implementation • 16 Feb 2024 • Ruijie Zheng, Ching-An Cheng, Hal Daumé III, Furong Huang, Andrey Kolobov

To do so, we bring a subtle but critical component of LLM training pipelines -- input tokenization via byte pair encoding (BPE) -- to the seemingly distant task of learning skills of variable time span in continuous control domains.

Continuous Control • Few-Shot Imitation Learning • +2
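
The BPE step is easy to picture in code. Below is a minimal sketch, not PRISE's actual implementation: it assumes primitive actions have already been quantized into integer tokens, and it merges frequently co-occurring token pairs into longer "skill" tokens.

```python
# Illustrative sketch (not PRISE's code): byte pair encoding over
# discretized action sequences, so frequent action pairs become single
# multi-step "skill" tokens.
from collections import Counter

def most_frequent_pair(sequences):
    """Count adjacent token pairs across all action sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts.most_common(1)[0][0] if counts else None

def merge_pair(seq, pair, new_token):
    """Replace every occurrence of `pair` in `seq` with `new_token`."""
    out, i = [], 0
    while i < len(seq):
        if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
            out.append(new_token)
            i += 2
        else:
            out.append(seq[i])
            i += 1
    return out

def bpe_skills(sequences, num_merges):
    """Learn `num_merges` skill tokens; return merged sequences and vocab."""
    vocab = {}  # new_token -> the pair it replaces
    next_token = max(t for s in sequences for t in s) + 1
    for _ in range(num_merges):
        pair = most_frequent_pair(sequences)
        if pair is None:
            break
        vocab[next_token] = pair
        sequences = [merge_pair(s, pair, next_token) for s in sequences]
        next_token += 1
    return sequences, vocab

# Toy demo: token ids stand in for quantized primitive actions.
demos = [[0, 1, 2, 0, 1, 2], [0, 1, 0, 1, 2]]
merged, vocab = bpe_skills(demos, num_merges=2)
print(merged, vocab)
```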

WindSeer: Real-time volumetric wind prediction over complex terrain aboard a small UAV

no code implementations • 18 Jan 2024 • Florian Achermann, Thomas Stastny, Bogdan Danciu, Andrey Kolobov, Jen Jen Chung, Roland Siegwart, Nicholas Lawrance

Real-time high-resolution wind predictions are beneficial for various applications including safe manned and unmanned aviation.

LLF-Bench: Benchmark for Interactive Learning from Language Feedback

no code implementations • 11 Dec 2023 • Ching-An Cheng, Andrey Kolobov, Dipendra Misra, Allen Nie, Adith Swaminathan

We introduce a new benchmark, LLF-Bench (Learning from Language Feedback Benchmark; pronounced as "elf-bench"), to evaluate the ability of AI agents to interactively learn from natural language feedback and instructions.

Information Retrieval • OpenAI Gym
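
The interaction pattern the benchmark evaluates looks roughly like a Gym loop in which language feedback stands in for a numeric reward. The environment and agent below are hypothetical toys, not the actual LLF-Bench API.

```python
# Hypothetical sketch of learning from language feedback: the env returns
# a sentence, not a reward, and the agent must act on the sentence.
import random

class ToyLanguageFeedbackEnv:
    """Guess a hidden integer; feedback is a hint sentence."""
    def __init__(self, low=0, high=9):
        self.low, self.high = low, high
        self.target = random.randint(low, high)

    def reset(self):
        self.target = random.randint(self.low, self.high)
        return "Guess the hidden number.", "Pick an integer in [0, 9]."

    def step(self, action):
        if action == self.target:
            return "Correct!", True
        hint = "higher" if action < self.target else "lower"
        return f"Wrong. Try a {hint} number.", False

env = ToyLanguageFeedbackEnv()
obs, instruction = env.reset()
low, high = 0, 9
for _ in range(10):
    guess = (low + high) // 2          # agent policy: binary search on hints
    feedback, done = env.step(guess)
    print(guess, "->", feedback)
    if done:
        break
    if "higher" in feedback:
        low = guess + 1
    else:
        high = guess - 1
```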

Interactive Robot Learning from Verbal Correction

no code implementations • 26 Oct 2023 • Huihan Liu, Alice Chen, Yuke Zhu, Adith Swaminathan, Andrey Kolobov, Ching-An Cheng

A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on verbal feedback, so that it avoids repeating mistakes in the future.

Language Modelling • Large Language Model

Goal Representations for Instruction Following: A Semi-Supervised Language Interface to Control

no code implementations • 30 Jun 2023 • Vivek Myers, Andre He, Kuan Fang, Homer Walke, Philippe Hansen-Estruch, Ching-An Cheng, Mihai Jalobeanu, Andrey Kolobov, Anca Dragan, Sergey Levine

Our method achieves robust performance in the real world by learning an embedding from the labeled data that aligns language not to the goal image, but rather to the desired change between the start and goal images that the instruction corresponds to.

Instruction Following
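
A rough sketch of that alignment idea, with stand-in linear encoders (none of these names or shapes come from the paper): score each instruction against the goal-minus-start embedding delta, so a contrastive loss could favor the matching pair.

```python
# Illustrative sketch: align a language embedding with the *change* between
# start and goal image embeddings, rather than with the goal image alone.
import numpy as np

rng = np.random.default_rng(0)

def encode_image(img, W):          # stand-in visual encoder
    return W @ img

def encode_text(tokens, V):        # stand-in language encoder
    return V @ tokens

def alignment_scores(starts, goals, texts, W, V):
    """Cosine similarity between text embeddings and goal-minus-start deltas."""
    deltas = encode_image(goals.T, W).T - encode_image(starts.T, W).T
    lang = encode_text(texts.T, V).T
    deltas /= np.linalg.norm(deltas, axis=1, keepdims=True)
    lang /= np.linalg.norm(lang, axis=1, keepdims=True)
    return lang @ deltas.T          # (num_texts, num_deltas)

# Toy batch: 4 (start, goal, instruction) triples with random features.
starts, goals = rng.normal(size=(4, 16)), rng.normal(size=(4, 16))
texts = rng.normal(size=(4, 32))
W, V = rng.normal(size=(8, 16)), rng.normal(size=(8, 32))
scores = alignment_scores(starts, goals, texts, W, V)
# A contrastive loss would push the diagonal of `scores` above off-diagonals.
print(np.round(scores, 2))
```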

Improving Offline RL by Blending Heuristics

no code implementations • 1 Jun 2023 • Sinong Geng, Aldo Pacchiano, Andrey Kolobov, Ching-An Cheng

We propose Heuristic Blending (HUBL), a simple performance-improving technique for a broad class of offline RL algorithms based on value bootstrapping.

D4RL • Offline RL
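
One way to read the abstract's blending idea, sketched on a tabular Q-update (the paper's exact relabeling scheme may differ): mix a heuristic value h(s') into the bootstrapped target with a blending weight.

```python
# Sketch of heuristic blending in a bootstrapped update; a reading of the
# abstract, not the authors' code.
import numpy as np

def blended_q_update(Q, s, a, r, s_next, h, alpha=0.1, gamma=0.99, lam=0.5):
    """One tabular Q-learning step with a heuristic-blended target.

    lam = 0 recovers ordinary bootstrapping; lam = 1 trusts h(s') fully.
    """
    bootstrap = np.max(Q[s_next])
    target = r + gamma * (lam * h[s_next] + (1.0 - lam) * bootstrap)
    Q[s, a] += alpha * (target - Q[s, a])
    return Q

# Toy usage: 3 states, 2 actions, heuristic from (say) Monte Carlo returns.
Q = np.zeros((3, 2))
h = np.array([0.0, 0.5, 1.0])
Q = blended_q_update(Q, s=0, a=1, r=0.1, s_next=2, h=h)
print(Q)
```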

PLEX: Making the Most of the Available Data for Robotic Manipulation Pretraining

no code implementations • 15 Mar 2023 • Garrett Thomas, Ching-An Cheng, Ricky Loynd, Felipe Vieira Frujeri, Vibhav Vineet, Mihai Jalobeanu, Andrey Kolobov

A rich representation is key to general robotic manipulation, but existing approaches to representation learning require large amounts of multimodal demonstrations.

Representation Learning

MoCapAct: A Multi-Task Dataset for Simulated Humanoid Control

1 code implementation • 15 Aug 2022 • Nolan Wagener, Andrey Kolobov, Felipe Vieira Frujeri, Ricky Loynd, Ching-An Cheng, Matthew Hausknecht

We demonstrate the utility of MoCapAct by using it to train a single hierarchical policy capable of tracking the entire MoCap dataset within dm_control, and we show that the learned low-level component can be reused to efficiently learn downstream high-level tasks.

Humanoid Control
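
The reuse pattern described above can be sketched as follows, with all components as toy stand-ins for the actual MoCapAct models: a frozen low-level controller consumes a skill input, and only a new high-level policy is trained for the downstream task.

```python
# Conceptual sketch (stand-ins, not MoCapAct code): frozen low-level
# tracking policy + trainable high-level policy emitting skill embeddings.
import numpy as np

rng = np.random.default_rng(0)
OBS, SKILL, ACT = 10, 4, 3

W_low = rng.normal(size=(ACT, OBS + SKILL))   # frozen after pretraining

def low_level(obs, skill):
    """Frozen low-level controller: (obs, skill) -> action."""
    return np.tanh(W_low @ np.concatenate([obs, skill]))

class HighLevelPolicy:
    """Trainable high-level policy: obs -> skill embedding."""
    def __init__(self):
        self.W = rng.normal(size=(SKILL, OBS)) * 0.1

    def __call__(self, obs):
        return np.tanh(self.W @ obs)

pi_hi = HighLevelPolicy()
obs = rng.normal(size=OBS)
action = low_level(obs, pi_hi(obs))   # only pi_hi.W would be updated by RL
print(action)
```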

The Sandbox Environment for Generalizable Agent Research (SEGAR)

1 code implementation • 19 Mar 2022 • R Devon Hjelm, Bogdan Mazoure, Florian Golemo, Felipe Frujeri, Mihai Jalobeanu, Andrey Kolobov

A broad challenge of research on generalization for sequential decision-making tasks in interactive environments is designing benchmarks that clearly landmark progress.

Decision Making

Cross-Trajectory Representation Learning for Zero-Shot Generalization in RL

1 code implementation • ICLR 2022 • Bogdan Mazoure, Ahmed M. Ahmed, Patrick MacAlpine, R Devon Hjelm, Andrey Kolobov

A highly desirable property of a reinforcement learning (RL) agent -- and a major difficulty for deep RL approaches -- is the ability to generalize policies learned on a few tasks over a high-dimensional observation space to similar tasks not seen during training.

Reinforcement Learning (RL) • Representation Learning • +1

Policy Improvement via Imitation of Multiple Oracles

no code implementations • NeurIPS 2020 • Ching-An Cheng, Andrey Kolobov, Alekh Agarwal

In this paper, we propose the state-wise maximum of the oracle policies' values as a natural baseline to resolve conflicting advice from multiple oracles.

Imitation Learning
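
The proposed baseline is simple to state: take the maximum over the oracles' value functions at each state. An illustrative sketch with made-up values:

```python
# Minimal sketch of the abstract's idea (names are illustrative): with K
# oracle policies, use the state-wise maximum of their value functions as
# the baseline, so conflicting oracles are resolved per state.
import numpy as np

def max_aggregated_baseline(oracle_values):
    """oracle_values: (K, num_states) array of V^{pi_k}(s); max over k."""
    return oracle_values.max(axis=0)

# Toy example: oracle 0 is better in states 0-1, oracle 1 in state 2.
V = np.array([[1.0, 0.8, 0.2],
              [0.3, 0.5, 0.9]])
baseline = max_aggregated_baseline(V)   # [1.0, 0.8, 0.9]
best_oracle = V.argmax(axis=0)          # which oracle to imitate per state
print(baseline, best_oracle)
```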

Online Learning for Active Cache Synchronization

1 code implementation • ICML 2020 • Andrey Kolobov, Sébastien Bubeck, Julian Zimmert

Existing multi-armed bandit (MAB) models make two implicit assumptions: an arm generates a payoff only when it is played, and the agent observes every payoff that is generated.
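
A toy simulation (not the paper's algorithm) makes the relaxed assumptions concrete: pages change whether or not they are polled, and the agent observes a change only when it polls.

```python
# Toy illustration of the synchronization setting: each "arm" is a cached
# page that changes on its own schedule; observation happens only on a poll.
import random

random.seed(0)
change_rates = [0.05, 0.3, 0.7]          # unknown per-page change probabilities
stale = [False] * len(change_rates)       # payoff accrues even when not played
observed_changes = [0] * len(change_rates)
polls = [0] * len(change_rates)

for t in range(1000):
    # Every page may change this step, regardless of whether it is polled.
    for i, rate in enumerate(change_rates):
        if random.random() < rate:
            stale[i] = True
    # Budget of one poll per step; round-robin stands in for a smart policy.
    i = t % len(change_rates)
    polls[i] += 1
    if stale[i]:                          # observation happens only on a poll
        observed_changes[i] += 1
        stale[i] = False                  # the poll refreshes the cached copy

est = [c / p for c, p in zip(observed_changes, polls)]
print("estimated change-per-poll:", [round(e, 2) for e in est])
```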

Staying up to Date with Online Content Changes Using Reinforcement Learning for Scheduling

1 code implementation • NeurIPS 2019 • Andrey Kolobov, Yuval Peres, Cheng Lu, Eric J. Horvitz

From traditional Web search engines to virtual assistants and Web accelerators, services that rely on online information need to continually keep track of remote content changes by explicitly requesting content updates from remote sources (e.g., web pages).

reinforcement-learning • Reinforcement Learning (RL) • +1

Autonomous Thermalling as a Partially Observable Markov Decision Process (Extended Version)

1 code implementation • 24 May 2018 • Iain Guilliard, Richard Rogahn, Jim Piavis, Andrey Kolobov

Small uninhabited aerial vehicles (sUAVs) commonly rely on active propulsion to stay airborne, which limits flight time and range.

Robotics • Systems and Control

Metareasoning for Planning Under Uncertainty

no code implementations • 3 May 2015 • Christopher H. Lin, Andrey Kolobov, Ece Kamar, Eric Horvitz

Our work subsumes previously studied special cases of metareasoning and shows that in the general case, metareasoning is at most polynomially harder than solving MDPs with any given algorithm that disregards the cost of thinking.
