no code implementations • 15 Feb 2024 • Taisuke Kobayashi
As a result, numerical simulations confirm that the proposed stabilization tricks make experience replay (ER) applicable to an advantage actor-critic, an on-policy algorithm.
no code implementations • 24 Aug 2023 • Taisuke Kobayashi
Although this problem can be avoided through careful reward design, reviewing the exception handling at episode termination is essential for the practical use of temporal-difference (TD) learning.
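For reference, the exception handling in question concerns how the bootstrap term is treated at episode termination; a minimal sketch of the standard convention, with illustrative names:

```python
# Minimal sketch of the standard exception handling at termination in TD
# learning: the terminal flag `done` zeroes the bootstrap term so that no
# value is propagated from beyond the end of the episode.

def td_target(reward, next_value, done, gamma=0.99):
    """One-step TD target: r + gamma * V(s'), unless the episode terminated."""
    return reward + gamma * (1.0 - float(done)) * next_value

# Example: at termination the target collapses to the immediate reward.
print(td_target(1.0, 5.0, done=False))  # 1.0 + 0.99 * 5.0 = 5.95
print(td_target(1.0, 5.0, done=True))   # 1.0
```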
no code implementations • 8 Mar 2023 • Taisuke Kobayashi
However, in current implementations the priority of maximizing the policy entropy is tuned automatically, and the tuning rule can be interpreted as an equality constraint that binds the policy entropy to its specified lower bound.
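The automatic tuning referred to is commonly implemented as in soft actor-critic, where a temperature alpha is adjusted by gradient descent until the policy entropy meets a fixed target; a minimal sketch of that common rule (names and values are illustrative, and this is the baseline, not the paper's proposal):

```python
import torch

# Common automatic entropy-temperature tuning (as in soft actor-critic).
# The rule drives the policy entropy toward a fixed target, which is why it
# behaves like an equality constraint on the entropy.
log_alpha = torch.zeros(1, requires_grad=True)
optimizer = torch.optim.Adam([log_alpha], lr=3e-4)
target_entropy = -4.0  # often set to -|A| for a |A|-dimensional action space

def update_alpha(log_prob):
    """log_prob: log pi(a|s) for actions sampled from the current policy."""
    alpha_loss = -(log_alpha * (log_prob + target_entropy).detach()).mean()
    optimizer.zero_grad()
    alpha_loss.backward()
    optimizer.step()
    return log_alpha.exp().item()
```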
no code implementations • 21 Dec 2022 • Taisuke Kobayashi
This paper introduces a novel method that adds intrinsic bonuses to the task-oriented reward function in order to facilitate efficient exploration in reinforcement learning.
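The general scheme of such reward shaping can be sketched as follows; the count-based bonus used here is a simple illustrative stand-in, not the bonus proposed in the paper:

```python
from collections import defaultdict

# Generic intrinsic-bonus scheme: the agent optimizes the task reward plus a
# weighted exploration bonus. The count-based bonus below is only a simple
# illustrative choice; states are assumed hashable (e.g., discretized).
visit_counts = defaultdict(int)
beta = 0.1  # weight of the intrinsic bonus

def shaped_reward(state, task_reward):
    visit_counts[state] += 1
    intrinsic_bonus = 1.0 / (visit_counts[state] ** 0.5)  # decays with visits
    return task_reward + beta * intrinsic_bonus
```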
no code implementations • 8 Aug 2022 • Taisuke Kobayashi, Ryoma Watanuki
We experimentally verified the benefits of the proposed sparsification: it can easily find the necessary and sufficient six dimensions for a reaching task with a mobile manipulator, which requires a six-dimensional state space.
no code implementations • 18 Mar 2022 • Taisuke Kobayashi
However, the density ratio is asymmetric about its center, and the possible error scale from that center, which should be close to the threshold, depends on how the baseline policy is given.
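The asymmetry can be checked numerically: doubling and halving a probability give density ratios at unequal distances from the center 1, even though their distances coincide in log space (a sketch, not the paper's analysis):

```python
import math

# The density ratio r = pi(a|s) / pi_baseline(a|s) is asymmetric about 1:
# ratios 2.0 and 0.5 sit at distances 1.0 and 0.5 from the center, while
# their log-space distances are both 0.693.
for r in (2.0, 0.5):
    print(r, abs(r - 1.0), abs(math.log(r)))
```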
no code implementations • 25 Feb 2022 • Taisuke Kobayashi
Recently, T-soft update has been proposed as a noise-robust update rule for the target network and has contributed to improving DRL performance.
no code implementations • 15 Feb 2022 • Taisuke Kobayashi
RL is known for the instability of the learning process and the sensitivity of the acquired policy to noise.
1 code implementation • 18 Jan 2022 • Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Takamitsu Matsubara
In this paper, we propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all the associated statistics.
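The idea of deriving moments from the Student's t-distribution can be illustrated, in heavily simplified form, as follows; this is a sketch of the robustness principle only, with illustrative names, not AdaTerm's actual derivation:

```python
# Heavily simplified illustration of Student's-t-based moment estimation:
# a gradient that is improbable under the current moment estimates gets a
# small weight, so outliers barely move the running statistics.
def update_moments(g, m, v, beta=0.9, nu=5.0):
    dsq = (g - m) ** 2 / (v + 1e-8)            # squared standardized deviation
    w = min(1.0, (nu + 1.0) / (nu + dsq))      # t-likelihood weight, small for outliers
    eff = beta + (1.0 - beta) * (1.0 - w)      # outliers push eff toward 1 (no change)
    m = eff * m + (1.0 - eff) * g              # robust first moment
    v = eff * v + (1.0 - eff) * (g - m) ** 2   # robust second moment
    return m, v
```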
no code implementations • 29 Nov 2021 • Taisuke Kobayashi, Takahito Enomoto
On the other hand, the concept of personal mobility is also gaining popularity, and its autonomous driving, specialized for individual drivers, is anticipated as a new step.
no code implementations • 3 Sep 2021 • Maciej Pietrowski, Andrzej Gajda, Takuto Yamamoto, Taisuke Kobayashi, Lana Sinapayen, Eiji Watanabe
GPU-specific computational processing is more indeterminate than that of CPUs, and hardware-derived uncertainties, which are often considered obstacles that need to be eliminated, might in some cases be successfully incorporated into the training of deep neural networks.
no code implementations • 2 Aug 2021 • Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto
In order to allow the imitators to effectively learn from imperfect demonstrations, we propose to employ the robust t-momentum optimization algorithm.
no code implementations • 23 Jun 2021 • Taisuke Kobayashi, Akiyoshi Kitaoka, Manabu Kosaka, Kenta Tanaka, Eiji Watanabe
In our previous study, we successfully reproduced the illusory motion of the rotating snakes illusion using deep neural networks incorporating predictive coding theory.
no code implementations • 18 Jun 2021 • Taisuke Kobayashi, Eiji Watanabe
Rotating Snakes is a visual illusion in which a stationary design is perceived to move dramatically.
no code implementations • 3 Jun 2021 • Taisuke Kobayashi
This paper proposes a new reinforcement learning method with hyperbolic discounting.
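For reference, hyperbolic discounting weights a reward t steps ahead by 1/(1 + kt) rather than the usual exponential gamma^t; a minimal comparison, with illustrative values of gamma and k:

```python
# Exponential vs. hyperbolic discounting of a reward t steps in the future.
# gamma and k are illustrative; hyperbolic discounting decays far more
# slowly at long horizons.
gamma, k = 0.99, 0.01

def exponential(t):
    return gamma ** t

def hyperbolic(t):
    return 1.0 / (1.0 + k * t)

for t in (1, 10, 100, 1000):
    print(t, round(exponential(t), 4), round(hyperbolic(t), 4))
# at t=1000: gamma^t ~ 4e-5, while 1/(1 + kt) ~ 0.09
```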
no code implementations • 27 May 2021 • Taisuke Kobayashi
This paper presents a new interpretation of the traditional optimization method in reinforcement learning (RL) as an optimization problem using reverse Kullback-Leibler (KL) divergence, and derives a new optimization method that uses forward KL divergence instead.
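For discrete distributions the two objectives differ only in the order of arguments, yet they behave very differently (reverse KL is mode-seeking, forward KL is mass-covering); a minimal sketch with illustrative distributions:

```python
import numpy as np

# Forward vs. reverse KL divergence between a target p and a model q.
def kl(a, b):
    return float(np.sum(a * np.log(a / b)))

p = np.array([0.5, 0.4, 0.1])    # illustrative target distribution
q = np.array([0.8, 0.15, 0.05])  # illustrative model (e.g., current policy)

print("reverse KL(q||p):", kl(q, p))  # penalizes q placing mass where p is small
print("forward KL(p||q):", kl(p, q))  # penalizes q missing mass that p has
```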
no code implementations • 1 Apr 2021 • Taisuke Kobayashi, Kenta Yoshizawa
To alleviate this drawback of FB controllers, feedback error learning integrates one of them with a feedforward (FF) controller.
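A minimal sketch of the feedback-error-learning scheme on a toy plant, where the FB controller's output serves as the training error for the FF controller (the plant, gains, and one-parameter FF model are all illustrative):

```python
# Feedback error learning on a toy first-order plant: the FF model is
# trained online using the FB output as its error signal, so the FB
# contribution shrinks as the FF model improves.
kp, lr = 2.0, 0.05
w = 0.0  # parameter of a trivial linear FF model: u_ff = w * reference

x = 0.0
for step in range(200):
    reference = 1.0
    u_ff = w * reference
    u_fb = kp * (reference - x)   # feedback controller
    u = u_ff + u_fb               # combined command
    x += 0.1 * (-x + u)           # toy first-order plant, dt = 0.1
    w += lr * u_fb * reference    # FB output acts as the FF error signal

print(w, x)  # w converges so that u_fb -> 0 at steady state
```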
no code implementations • 20 Nov 2020 • Koki Kobayashi, Masaki Ogura, Taisuke Kobayashi, Kenji Sugimoto
In this paper, we propose a deep unfolding-based framework for the output feedback control of systems with input saturation.
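Deep unfolding unrolls a fixed number of iterations of an iterative algorithm into network layers whose per-iteration parameters are then learned; a generic sketch using unrolled gradient descent with learnable step sizes, illustrating the framework rather than the paper's controller design:

```python
import torch

# Generic deep-unfolding sketch: K gradient-descent iterations on a
# quadratic objective become "layers", and the per-layer step sizes are
# trainable parameters.
class UnfoldedGD(torch.nn.Module):
    def __init__(self, K=10):
        super().__init__()
        self.steps = torch.nn.Parameter(0.1 * torch.ones(K))

    def forward(self, A, b, x):
        for eta in self.steps:
            # gradient step for 0.5 * x^T A x - b^T x (A symmetric PSD)
            x = x - eta * (A @ x - b)
        return x
```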
no code implementations • 7 Oct 2020 • Taisuke Kobayashi
PPO clips the density ratio between the latest and baseline policies at a threshold, while its minimization target is unclear.
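The clipping in question is PPO's surrogate objective; a minimal sketch, with illustrative names:

```python
import torch

# PPO's clipped surrogate objective: the density ratio between the latest
# and baseline (old) policies is clipped at 1 +/- epsilon before being
# multiplied by the advantage.
def ppo_clip_loss(log_prob, old_log_prob, advantage, epsilon=0.2):
    ratio = torch.exp(log_prob - old_log_prob)           # density ratio
    clipped = torch.clamp(ratio, 1.0 - epsilon, 1.0 + epsilon)
    return -torch.min(ratio * advantage, clipped * advantage).mean()
```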
no code implementations • 25 Aug 2020 • Taisuke Kobayashi, Wendyam Eric Lionel Ilboudo
The problem with the conventional update rule is that all parameters are copied smoothly from the main network at the same speed, even when some of them are updating in the wrong direction.
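The conventional rule referred to is the exponential soft (Polyak) update, which blends every parameter toward the main network at one shared rate; a minimal sketch in PyTorch:

```python
import torch

# Conventional soft (Polyak) target-network update: every parameter is
# blended toward the main network at the same rate tau, regardless of
# whether an individual update direction is trustworthy.
@torch.no_grad()
def soft_update(target_net, main_net, tau=0.005):
    for t_param, m_param in zip(target_net.parameters(), main_net.parameters()):
        t_param.mul_(1.0 - tau).add_(tau * m_param)
```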
no code implementations • 23 Aug 2020 • Taisuke Kobayashi
The eligibility traces method is well known as an online learning technique for improving sample efficiency in traditional reinforcement learning with linear regressors, rather than in DRL.
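A minimal sketch of that classical setting, TD(lambda) with accumulating traces and a linear value function (all names are illustrative):

```python
import numpy as np

# TD(lambda) with accumulating eligibility traces and a linear value
# function V(s) = w . phi(s), the classical setting for this technique.
def td_lambda_step(w, e, phi_s, phi_next, reward, done,
                   alpha=0.1, gamma=0.99, lam=0.9):
    v_s = w @ phi_s
    v_next = 0.0 if done else w @ phi_next
    delta = reward + gamma * v_next - v_s   # TD error
    e = gamma * lam * e + phi_s             # decay and accumulate the trace
    w = w + alpha * delta * e               # credit recently visited features too
    return w, e
```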
no code implementations • 31 Jul 2020 • Taisuke Kobayashi
In real-world data, noise and outliers cannot be excluded from the datasets used for learning robot skills.
no code implementations • 4 Mar 2020 • Taisuke Kobayashi
In the proposed method, a standard variational autoencoder (VAE) is employed to statistically extract the latent space hidden in the sampled data, and this latent space makes the robots controllable within feasible computational time and cost.
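For reference, the standard VAE machinery mentioned here encodes observations into a Gaussian over a low-dimensional latent space via the reparameterization trick; a minimal sketch with illustrative layer sizes, not the paper's architecture:

```python
import torch
import torch.nn as nn

# Minimal standard VAE: the encoder maps observations to a Gaussian over a
# low-dimensional latent space, sampled with the reparameterization trick.
class VAE(nn.Module):
    def __init__(self, obs_dim=64, latent_dim=4):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 32), nn.ReLU())
        self.mu = nn.Linear(32, latent_dim)
        self.log_var = nn.Linear(32, latent_dim)
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 32), nn.ReLU(), nn.Linear(32, obs_dim))

    def forward(self, x):
        h = self.encoder(x)
        mu, log_var = self.mu(h), self.log_var(h)
        z = mu + torch.randn_like(mu) * (0.5 * log_var).exp()  # reparameterize
        return self.decoder(z), mu, log_var
```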
3 code implementations • 29 Feb 2020 • Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto
Machine learning algorithms aim to find patterns from observations, which may include some noise, especially in the robotics domain.