no code implementations • 6 Jul 2020 • Xinyan Yan, Byron Boots, Ching-An Cheng
Here policies are optimized by performing online learning on a sequence of loss functions that encourage the learner to mimic expert actions, and if the online learning has no regret, the agent can provably learn an expert-like policy.
no code implementations • 8 Aug 2019 • Ching-An Cheng, Xinyan Yan, Byron Boots
This can be attributed, at least in part, to the high variance in estimating the gradient of the task objective with Monte Carlo methods.
1 code implementation • 15 Oct 2018 • Ching-An Cheng, Xinyan Yan, Nathan Ratliff, Byron Boots
We present a predictor-corrector framework, called PicCoLO, that can transform a first-order model-free reinforcement or imitation learning algorithm into a new hybrid method that leverages predictive models to accelerate policy learning.
no code implementations • 12 Jun 2018 • Ching-An Cheng, Xinyan Yan, Evangelos A. Theodorou, Byron Boots
When the model oracle is learned online, these algorithms can provably accelerate the best known convergence rate up to an order.
no code implementations • 26 May 2018 • Ching-An Cheng, Xinyan Yan, Nolan Wagener, Byron Boots
We show that if the switching time is properly randomized, LOKI can learn to outperform a suboptimal expert and converge faster than running policy gradient from scratch.
no code implementations • 15 Oct 2017 • Xinyan Yan, Krzysztof Choromanski, Byron Boots, Vikas Sindhwani
Policy evaluation, that is, approximating the value function or Q-function, is a key procedure in reinforcement learning (RL).
no code implementations • 21 Sep 2017 • Yunpeng Pan, Ching-An Cheng, Kamil Saigol, Keuntaek Lee, Xinyan Yan, Evangelos Theodorou, Byron Boots
We present an end-to-end imitation learning system for agile, off-road autonomous driving using only low-cost on-board sensors.
Robotics
no code implementations • ICML 2017 • Yunpeng Pan, Xinyan Yan, Evangelos A. Theodorou, Byron Boots
Sparse Spectrum Gaussian Processes (SSGPs) are a powerful tool for scaling Gaussian processes (GPs) to large datasets.
1 code implementation • 24 Jul 2017 • Mustafa Mukadam, Jing Dong, Xinyan Yan, Frank Dellaert, Byron Boots
We benchmark our algorithms against several sampling-based and trajectory optimization-based motion planning algorithms on planning problems in multiple environments.
Robotics
no code implementations • 22 Aug 2016 • Yunpeng Pan, Xinyan Yan, Evangelos Theodorou, Byron Boots
Robotic systems must be able to quickly and robustly make decisions when operating in uncertain and dynamic environments.