Search Results for author: Taisuke Kobayashi

Found 24 papers, 2 papers with code

Revisiting Experience Replayable Conditions

no code implementations 15 Feb 2024 Taisuke Kobayashi

As a result, numerical simulations confirm that the proposed stabilization tricks make experience replay (ER) applicable to advantage actor-critic, an on-policy algorithm.
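The experience replay that this paper makes applicable to on-policy learning can be sketched, in its generic off-policy form, as a buffer of past transitions sampled uniformly at random. This is a minimal illustration only; the paper's stabilization tricks are not reproduced here.

```python
import random
from collections import deque

class ReplayBuffer:
    """Generic experience replay buffer (the conventional off-policy form,
    not the paper's stabilization tricks)."""

    def __init__(self, capacity, seed=0):
        self.buffer = deque(maxlen=capacity)
        self.rng = random.Random(seed)

    def push(self, state, action, reward, next_state, done):
        # Store one transition; old transitions are evicted FIFO at capacity.
        self.buffer.append((state, action, reward, next_state, done))

    def sample(self, batch_size):
        # Uniformly resample past transitions for gradient updates.
        return self.rng.sample(list(self.buffer), batch_size)

buf = ReplayBuffer(capacity=100)
for t in range(10):
    buf.push(t, 0, 1.0, t + 1, t == 9)
batch = buf.sample(4)
```

Reusing such off-policy samples normally destabilizes on-policy algorithms like advantage actor-critic, which is the gap the paper's conditions and tricks address.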

Metric Learning

Intentionally-underestimated Value Function at Terminal State for Temporal-difference Learning with Mis-designed Reward

no code implementations 24 Aug 2023 Taisuke Kobayashi

Although this problem can be avoided through careful reward design, reviewing the exception handling at the terminal state is essential for the practical use of TD learning.
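The exception handling in question is the bootstrap term of the one-step TD target, which is conventionally zeroed at termination. A minimal sketch, where `terminal_value` is a hypothetical knob standing in for the paper's intentional underestimation:

```python
def td_target(reward, done, next_value, gamma=0.99, terminal_value=0.0):
    """One-step TD target with explicit terminal handling.

    Conventionally the bootstrap term is zero at termination; the paper
    instead argues for intentionally underestimating the terminal value
    (exposed here as `terminal_value`, an illustrative parameter).
    """
    if done:
        return reward + gamma * terminal_value
    return reward + gamma * next_value

# Conventional exception handling at termination (bootstrap = 0):
y_conv = td_target(reward=1.0, done=True, next_value=5.0)
# Intentionally-underestimated terminal value, in the paper's spirit:
y_under = td_target(reward=1.0, done=True, next_value=5.0, terminal_value=-1.0)
```

With a mis-designed (e.g. strictly positive) reward, the conventional zero bootstrap can overestimate the terminal state relative to non-terminal states; the underestimated target pushes the learned values in the opposite, safer direction.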

Soft Actor-Critic Algorithm with Truly-satisfied Inequality Constraint

no code implementations 8 Mar 2023 Taisuke Kobayashi

However, in the current implementation, the priority of maximizing the policy entropy is tuned automatically by a rule that can be interpreted as an equality constraint, binding the policy entropy to its specified lower bound.
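The automatic tuning rule being critiqued is the standard SAC temperature objective. A simplified single-batch sketch (scalar log-alpha, Monte-Carlo estimate of the entropy) of the conventional rule, not the paper's corrected constraint:

```python
import math

def alpha_loss(log_alpha, log_pi_samples, target_entropy):
    """Standard SAC temperature objective (the rule the paper critiques).

    Gradient descent on this loss raises alpha when the policy entropy
    (approx. -mean log_pi) falls below `target_entropy` and lowers it
    otherwise, so it pins the entropy to the bound like an equality
    constraint rather than keeping it as an inequality.
    """
    alpha = math.exp(log_alpha)
    mean_log_pi = sum(log_pi_samples) / len(log_pi_samples)
    return -alpha * (mean_log_pi + target_entropy)

# Entropy (0.5) below target (1.0): gradient w.r.t. log_alpha is negative,
# so gradient descent increases alpha, pushing entropy back up to the bound.
loss_low = alpha_loss(0.0, [-0.5, -0.5], target_entropy=1.0)
# Entropy (2.0) above target: gradient is positive, so alpha is decreased.
loss_high = alpha_loss(0.0, [-2.0, -2.0], target_entropy=1.0)
```

Because the rule acts in both directions, the entropy is driven onto the bound instead of merely kept above it, which is exactly the equality-constraint behavior the paper points out.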

Reward Bonuses with Gain Scheduling Inspired by Iterative Deepening Search

no code implementations 21 Dec 2022 Taisuke Kobayashi

This paper introduces a novel method of adding intrinsic bonuses to a task-oriented reward function in order to facilitate efficient exploration in reinforcement learning.
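The shape of such a gain-scheduled bonus can be sketched as a task reward plus an intrinsic bonus whose gain follows a schedule. The saw-tooth schedule below is only a hypothetical stand-in inspired by iterative deepening's restart-and-deepen pattern; the paper's actual schedule may differ.

```python
def scheduled_reward(task_reward, bonus, episode, period=100):
    """Task reward plus an intrinsic bonus with a scheduled gain.

    Hypothetical schedule: the gain restarts at 1 and decays linearly
    within each deepening period, echoing iterative deepening search.
    """
    gain = 1.0 - (episode % period) / period   # saw-tooth gain in (0, 1]
    return task_reward + gain * bonus

r_start = scheduled_reward(task_reward=0.0, bonus=2.0, episode=0)   # gain = 1.0
r_mid = scheduled_reward(task_reward=0.0, bonus=2.0, episode=50)    # gain = 0.5
```

Decaying the gain keeps the intrinsic bonus from permanently distorting the task-oriented objective, while the restarts re-encourage exploration.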

Scheduling

Sparse Representation Learning with Modified q-VAE towards Minimal Realization of World Model

no code implementations 8 Aug 2022 Taisuke Kobayashi, Ryoma Watanuki

We experimentally verified the benefit of the sparsification achieved by the proposed method: it can easily find the necessary and sufficient six dimensions for a reaching task with a mobile manipulator, which requires a six-dimensional state space.

Representation Learning

Proximal Policy Optimization with Adaptive Threshold for Symmetric Relative Density Ratio

no code implementations 18 Mar 2022 Taisuke Kobayashi

However, the density ratio is asymmetric about its center, and the possible error scale from that center, which should be close to the threshold, depends on how the baseline policy is given.
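The asymmetry concerns the standard PPO clipped surrogate, where the density ratio can shrink at most to 0 (distance 1 from its center) but grow without bound. A single-sample sketch of the conventional objective the paper revisits:

```python
def ppo_clip_objective(ratio, advantage, eps=0.2):
    """Standard PPO clipped surrogate for a single sample.

    `ratio` is pi_new(a|s) / pi_baseline(a|s); the pessimistic min keeps
    the update from exploiting ratios outside [1 - eps, 1 + eps].
    """
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)

# For a positive advantage, clipping only limits the upside: a small
# ratio passes through unclipped, while a large ratio is capped.
lo = ppo_clip_objective(ratio=0.5, advantage=1.0)   # min(0.5, 0.8) = 0.5
hi = ppo_clip_objective(ratio=3.0, advantage=1.0)   # capped near 1.2
```

Because the ratio's range below and above its center of 1 is so lopsided, a single fixed threshold `eps` treats the two sides unevenly, which motivates the paper's symmetric relative density ratio with an adaptive threshold.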

Consolidated Adaptive T-soft Update for Deep Reinforcement Learning

no code implementations 25 Feb 2022 Taisuke Kobayashi

Recently, T-soft update has been proposed as a noise-robust update rule for the target network and has contributed to improving the DRL performance.
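The conventional rule that T-soft update improves upon is the plain exponential-moving-average (Polyak) target update. A minimal sketch of that baseline; T-soft update itself replaces the fixed `tau` with a noise-robust, Student's-t-based weight, which is not reproduced here:

```python
def soft_update(target, main, tau=0.005):
    """Plain soft (Polyak) update of target-network parameters.

    Every parameter moves toward the main network at the same fixed
    speed `tau`; T-soft update instead adapts this speed per update to
    suppress noisy or outlier parameter changes.
    """
    return [(1.0 - tau) * t + tau * m for t, m in zip(target, main)]

target = [0.0, 0.0]
main = [1.0, -1.0]
target = soft_update(target, main, tau=0.5)   # halfway toward main
```

The fixed-speed copy is exactly the weakness the consolidated adaptive variant addresses: with a shared `tau`, wrong-direction parameter updates are copied just as faithfully as good ones.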

Reinforcement Learning (RL)

AdaTerm: Adaptive T-Distribution Estimated Robust Moments for Noise-Robust Stochastic Gradient Optimization

1 code implementation 18 Jan 2022 Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Takamitsu Matsubara

In this paper, we propose AdaTerm, a novel approach that incorporates the Student's t-distribution to derive not only the first-order moment but also all the associated statistics.
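The core idea of t-distribution-based robust moment estimation can be sketched with a single scalar moment: the step taken toward a new gradient is down-weighted when the gradient deviates far from the running estimate. This is a deliberately simplified illustration of the mechanism behind AdaTerm (and the earlier TAdam), not the paper's actual update rules:

```python
def t_robust_mean(m, grad, var, nu=3.0, base_lr=0.1):
    """Student-t-style robust update of a first-moment estimate.

    The weight shrinks as the squared deviation of `grad` from the
    current estimate `m` grows relative to the running variance `var`;
    `nu` plays the role of the degrees of freedom, controlling how
    aggressively outliers are suppressed.
    """
    w = (nu + 1.0) / (nu + (grad - m) ** 2 / var)
    return m + base_lr * w * (grad - m)

m = 0.0
m_inlier = t_robust_mean(m, grad=1.0, var=1.0)     # near-full step
m_outlier = t_robust_mean(m, grad=10.0, var=1.0)   # heavily down-weighted
```

Note that the outlier gradient, despite being ten times larger, moves the moment estimate less than the inlier does; a plain exponential moving average would instead be dragged far off by it.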

Towards Autonomous Driving of Personal Mobility with Small and Noisy Dataset using Tsallis-statistics-based Behavioral Cloning

no code implementations 29 Nov 2021 Taisuke Kobayashi, Takahito Enomoto

Meanwhile, the concept of personal mobility is also gaining popularity, and autonomous driving specialized for individual drivers is expected as a new step.

Autonomous Driving

Impact of GPU uncertainty on the training of predictive deep neural networks

no code implementations 3 Sep 2021 Maciej Pietrowski, Andrzej Gajda, Takuto Yamamoto, Taisuke Kobayashi, Lana Sinapayen, Eiji Watanabe

GPU-specific computational processing is more indeterminate than that of CPUs, and hardware-derived uncertainties, which are often considered obstacles to be eliminated, might in some cases be successfully incorporated into the training of deep neural networks.

Adaptive t-Momentum-based Optimization for Unknown Ratio of Outliers in Amateur Data in Imitation Learning

no code implementations 2 Aug 2021 Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto

In order to allow the imitators to effectively learn from imperfect demonstrations, we propose to employ the robust t-momentum optimization algorithm.

Imitation Learning

Motion Illusion-like Patterns Extracted from Photo and Art Images Using Predictive Deep Neural Networks

no code implementations 23 Jun 2021 Taisuke Kobayashi, Akiyoshi Kitaoka, Manabu Kosaka, Kenta Tanaka, Eiji Watanabe

In our previous study, we successfully reproduced the illusory motion of the rotating snakes illusion using deep neural networks incorporating predictive coding theory.

Artificial Perception Meets Psychophysics, Revealing a Fundamental Law of Illusory Motion

no code implementations 18 Jun 2021 Taisuke Kobayashi, Eiji Watanabe

Rotating Snakes is a visual illusion in which a stationary design is perceived to move dramatically.

Optimistic Reinforcement Learning by Forward Kullback-Leibler Divergence Optimization

no code implementations 27 May 2021 Taisuke Kobayashi

This paper presents a new interpretation of the traditional optimization method in reinforcement learning (RL) as an optimization problem using reverse Kullback-Leibler (KL) divergence, and derives a new optimization method that uses forward KL divergence instead.
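The distinction between the two objectives rests on the asymmetry of the KL divergence: KL(p || q) and KL(q || p) generally differ, with the reverse direction known to be mode-seeking and the forward direction mass-covering. A minimal sketch for discrete distributions, with `p` and `q` as purely illustrative stand-ins for the target distribution and the policy:

```python
import math

def kl(p, q):
    """KL(p || q) for discrete distributions given as probability lists."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.8, 0.2]   # illustrative target distribution
q = [0.5, 0.5]   # illustrative current policy
forward = kl(p, q)   # forward KL: KL(target || policy), mass-covering
reverse = kl(q, p)   # reverse KL: KL(policy || target), mode-seeking
```

The mass-covering behavior of the forward direction is what yields the optimism the title refers to: the policy is penalized for assigning too little probability anywhere the target has mass, rather than only for placing mass where the target has little.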

Reinforcement Learning (RL)

Optimization Algorithm for Feedback and Feedforward Policies towards Robot Control Robust to Sensing Failures

no code implementations 1 Apr 2021 Taisuke Kobayashi, Kenta Yoshizawa

To alleviate this drawback of the FB controllers, feedback error learning integrates one of them with a feedforward (FF) controller.
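The structural idea of feedback error learning is that a feedforward (FF) controller supplies the nominal command while a feedback (FB) controller corrects the residual error, and the FB output also serves as the FF controller's training signal. A minimal sketch of the command combination only, with illustrative proportional feedback and made-up signal values:

```python
def control(u_ff, state, reference, kp=1.0):
    """Combine a feedforward command with a proportional feedback term.

    In feedback error learning, u_fb both corrects the command online
    and acts as the error signal for training the FF controller; this
    sketch shows only the summation, with an illustrative gain `kp`.
    """
    u_fb = kp * (reference - state)
    return u_ff + u_fb

u = control(u_ff=2.0, state=0.5, reference=1.0)   # 2.0 + 0.5 = 2.5
```

Because the FF controller does not depend on the current sensed state, it can keep the robot controllable when sensing fails, which is the robustness the paper's joint optimization of the two policies targets.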

Reinforcement Learning (RL)

Deep unfolding-based output feedback control design for linear systems with input saturation

no code implementations 20 Nov 2020 Koki Kobayashi, Masaki Ogura, Taisuke Kobayashi, Kenji Sugimoto

In this paper, we propose a deep unfolding-based framework for the output feedback control of systems with input saturation.

Proximal Policy Optimization with Relative Pearson Divergence

no code implementations 7 Oct 2020 Taisuke Kobayashi

PPO clips the density ratio between the latest and baseline policies with a threshold, but its minimization target is unclear.
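The relative Pearson (RPE) divergence that grounds the paper's variant can be sketched for discrete distributions: it is a Pearson-type divergence measured against the mixture of the two policies, so the relative density ratio stays bounded by 1/beta. Constants and conventions below are a sketch and may differ from the paper's:

```python
def relative_pearson(p, q, beta=0.5):
    """Relative Pearson (RPE) divergence between discrete distributions.

    Measures the Pearson divergence of p against the mixture
    m = beta*p + (1-beta)*q. Since m >= beta*p elementwise, the relative
    density ratio p/m never exceeds 1/beta, giving a well-defined
    minimization target for ratio clipping.
    """
    div = 0.0
    for pi, qi in zip(p, q):
        mi = beta * pi + (1.0 - beta) * qi
        div += mi * (pi / mi - 1.0) ** 2
    return 0.5 * div

p = [0.9, 0.1]
q = [0.5, 0.5]
d = relative_pearson(p, q)
ratios = [pi / (0.5 * pi + 0.5 * qi) for pi, qi in zip(p, q)]  # all <= 1/beta = 2
```

The boundedness is the key property: unlike the raw density ratio, the relative ratio cannot blow up even where the baseline policy assigns near-zero probability.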

t-Soft Update of Target Network for Deep Reinforcement Learning

no code implementations 25 Aug 2020 Taisuke Kobayashi, Wendyam Eric Lionel Ilboudo

The problem with its conventional update rule is that all the parameters are copied smoothly at the same speed from the main network, even when some of them are updating in the wrong direction.

Reinforcement Learning (RL)

Adaptive and Multiple Time-scale Eligibility Traces for Online Deep Reinforcement Learning

no code implementations 23 Aug 2020 Taisuke Kobayashi

The eligibility traces method is well known as an online learning technique for improving sample efficiency in traditional reinforcement learning with linear regressors, rather than in DRL.
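The traditional linear-regressor setting referred to here is TD(lambda) with accumulating eligibility traces: each TD error is spread over recently visited features, so a single transition updates more than one parameter. A minimal one-step sketch with illustrative feature vectors:

```python
def td_lambda_step(w, trace, features, reward, next_features,
                   gamma=0.99, lam=0.9, lr=0.1):
    """One TD(lambda) update with accumulating eligibility traces.

    Linear value function V(s) = w . phi(s). The trace decays by
    gamma*lam each step and accumulates the current features, letting
    the TD error credit states visited several steps earlier.
    """
    v = sum(wi * fi for wi, fi in zip(w, features))
    v_next = sum(wi * fi for wi, fi in zip(w, next_features))
    delta = reward + gamma * v_next - v                       # TD error
    trace = [gamma * lam * e + fi for e, fi in zip(trace, features)]
    w = [wi + lr * delta * e for wi, e in zip(w, trace)]
    return w, trace

w, trace = [0.0, 0.0], [0.0, 0.0]
w, trace = td_lambda_step(w, trace, features=[1.0, 0.0], reward=1.0,
                          next_features=[0.0, 1.0])
```

Carrying this online credit-assignment machinery over to nonlinear DRL function approximators, with adaptive and multiple time scales for the decay, is the paper's contribution; this sketch shows only the classical linear rule.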

Reinforcement Learning (RL)

Towards Deep Robot Learning with Optimizer applicable to Non-stationary Problems

no code implementations 31 Jul 2020 Taisuke Kobayashi

In real-world data, noise and outliers cannot be excluded from the dataset used for learning robot skills.

q-VAE for Disentangled Representation Learning and Latent Dynamical Systems

no code implementations 4 Mar 2020 Taisuke Kobayashi

In the proposed method, a standard VAE is employed to statistically extract latent space hidden in sampled data, and this latent space helps make robots controllable in feasible computational time and cost.
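The differentiable sampling at the heart of the standard VAE encoder mentioned here is the reparameterization trick, z = mu + sigma * eps. This sketch shows only that conventional Gaussian component; the q-VAE's Tsallis-statistics modifications are not reproduced:

```python
import math
import random

def reparameterize(mu, log_var, rng):
    """Reparameterization trick of a standard Gaussian VAE encoder.

    Writing z = mu + sigma * eps with eps ~ N(0, 1) keeps the sample
    differentiable with respect to the encoder outputs (mu, log_var),
    so the latent space can be trained by backpropagation.
    """
    sigma = math.exp(0.5 * log_var)
    eps = rng.gauss(0.0, 1.0)
    return mu + sigma * eps

rng = random.Random(0)
samples = [reparameterize(mu=2.0, log_var=0.0, rng=rng) for _ in range(2000)]
mean_est = sum(samples) / len(samples)   # should be near mu = 2.0
```

Extracting such a low-dimensional latent space is what makes the downstream robot control tractable in feasible computational time and cost, as the snippet above describes.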

Representation Learning

TAdam: A Robust Stochastic Gradient Optimizer

3 code implementations 29 Feb 2020 Wendyam Eric Lionel Ilboudo, Taisuke Kobayashi, Kenji Sugimoto

Machine learning algorithms aim to find patterns from observations, which may include some noise, especially in the robotics domain.

Reinforcement Learning (RL)
