Search Results for author: Eiji Uchibe

Found 10 papers, 0 papers with code

Randomized-to-Canonical Model Predictive Control for Real-world Visual Robotic Manipulation

no code implementations • 5 Jul 2022 • Tomoya Yamanokuchi, Yuhwan Kwon, Yoshihisa Tsurumine, Eiji Uchibe, Jun Morimoto, Takamitsu Matsubara

However, such works are limited to one-shot transfer, where real-world data must be collected once to perform the sim-to-real transfer, which remains a significant human effort in transferring the models learned in simulations to new domains in the real world.

Model Predictive Control

Paper
Add Code

Model-Based Imitation Learning Using Entropy Regularization of Model and Policy

no code implementations • 21 Jun 2022 • Eiji Uchibe

We derive structured discriminators so that the learning of the policy and the model is efficient.

counterfactual Imitation Learning +3

Paper
Add Code

Enforcing KL Regularization in General Tsallis Entropy Reinforcement Learning via Advantage Learning

no code implementations • 16 May 2022 • Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

Maximum Tsallis entropy (MTE) framework in reinforcement learning has gained popularity recently by virtue of its flexible modeling choices including the widely used Shannon entropy and sparse entropy.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

$q$-Munchausen Reinforcement Learning

no code implementations • 16 May 2022 • Lingwei Zhu, Zheng Chen, Eiji Uchibe, Takamitsu Matsubara

The recently successful Munchausen Reinforcement Learning (M-RL) features implicit Kullback-Leibler (KL) regularization by augmenting the reward function with logarithm of the current stochastic policy.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Forward and inverse reinforcement learning sharing network weights and hyperparameters

no code implementations • 17 Aug 2020 • Eiji Uchibe, Kenji Doya

A forward RL step minimizes the reverse KL estimated by the inverse RL step.

Imitation Learning reinforcement-learning +1

Paper
Add Code

Unbounded Output Networks for Classification

no code implementations • 25 Jul 2018 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

In this study, by adopting features of the EE-RBM approach to feed-forward neural networks, we propose the UnBounded output network (UBnet) which is characterized by three features: (1) unbounded output units; (2) the target value of correct classification is set to a value much greater than one; and (3) the models are trained by a modified mean-squared error objective.

Classification General Classification

Paper
Add Code

Unifying Value Iteration, Advantage Learning, and Dynamic Policy Programming

no code implementations • 30 Oct 2017 • Tadashi Kozuno, Eiji Uchibe, Kenji Doya

Approximate dynamic programming algorithms, such as approximate value iteration, have been successfully applied to many complex reinforcement learning tasks, and a better approximate dynamic programming algorithm is expected to further extend the applicability of reinforcement learning to various tasks.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Online Meta-learning by Parallel Algorithm Competition

no code implementations • 24 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

In the OMPAC method, several instances of a reinforcement learning algorithm are run in parallel with small differences in the initial values of the meta-parameters.

Atari Games Meta-Learning +3

Paper
Add Code

Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning

no code implementations • 10 Feb 2017 • Stefan Elfwing, Eiji Uchibe, Kenji Doya

First, we propose two activation functions for neural network function approximation in reinforcement learning: the sigmoid-weighted linear unit (SiLU) and its derivative function (dSiLU).

Atari Games reinforcement-learning +1

Paper
Add Code

A Generalized Natural Actor-Critic Algorithm

no code implementations • NeurIPS 2009 • Tetsuro Morimura, Eiji Uchibe, Junichiro Yoshimoto, Kenji Doya

In this paper, we describe a generalized Natural Gradient (gNG) by linearly interpolating the two FIMs and propose an efficient implementation for the gNG learning based on a theory of the estimating function, generalized Natural Actor-Critic (gNAC).

Reinforcement Learning (RL)

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.