Updated: 2006/5/16

Research activities


Optimizing Resolution for Feature Extraction in Robotic Motion Learning

This paper presents a feature extraction method for robotic motion learning that adapts the image resolution to the task, thereby minimizing computation time. It uses mean-shift algorithms and principal component analysis for feature extraction, reinforcement learning for motion learning, and trial and error to find an appropriate resolution. When applied to a manipulator pushing an object, the resolution adjustment method reduces the task time from one minute to 21 seconds.
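The trial-and-error search for a resolution can be illustrated with a minimal Python sketch. It is not the method of the paper: the feature here is a plain intensity-weighted centroid instead of mean-shift and principal component analysis, and the adequacy test (the feature must stay within a tolerance of its full-resolution value) is an assumption made only for illustration. The names `downsample`, `centroid` and `coarsest_adequate_factor` are hypothetical.

```python
def downsample(image, factor):
    """Block-average a 2D image (list of rows) by an integer factor."""
    rows, cols = len(image) // factor, len(image[0]) // factor
    return [
        [
            sum(image[r * factor + i][c * factor + j]
                for i in range(factor) for j in range(factor)) / factor ** 2
            for c in range(cols)
        ]
        for r in range(rows)
    ]


def centroid(image):
    """Intensity-weighted centroid in normalized [0, 1] image coordinates."""
    h, w = len(image), len(image[0])
    total = sum(map(sum, image))
    cx = sum((c + 0.5) * v for row in image for c, v in enumerate(row)) / (total * w)
    cy = sum((r + 0.5) * v for r, row in enumerate(image) for v in row) / (total * h)
    return cx, cy


def coarsest_adequate_factor(image, factors, tol):
    """Try factors from coarsest to finest and return the first one whose
    feature stays within tol of the full-resolution feature (assumed criterion)."""
    ref = centroid(image)
    for factor in sorted(factors, reverse=True):
        feat = centroid(downsample(image, factor))
        if max(abs(feat[0] - ref[0]), abs(feat[1] - ref[1])) <= tol:
            return factor
    return 1


# Hypothetical 8x8 image containing a bright 2x2 object.
image = [[0.0] * 8 for _ in range(8)]
for r in (1, 2):
    for c in (5, 6):
        image[r][c] = 1.0

best = coarsest_adequate_factor(image, [1, 2, 4, 8], tol=0.05)
```

In this toy case the coarsest factor collapses the image to a single pixel and loses the object position, while the next factor still preserves it, so the search settles on the second-coarsest resolution.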

Proposed architecture for feature extraction from image inputs

Experimental setup with camera and manipulator for object pushing task

Flow of feature extraction using image information

Processed images at each time step with different resolutions

Publications:

  1. Masato Kato, Yuichi Kobayashi and Shigeyuki Hosoe, ``Optimizing Resolution for Feature Extraction in Robotic Motion Learning,'' IEEE Int. Conf. on Systems, Man & Cybernetics, Hawaii USA, 1086-1091, 2005.

Reinforcement learning for object manipulation using low-dimensional mapping

This paper proposes a reinforcement learning method for dynamic control problems with holonomic constraints. The learning method is applicable to problems where the actual motion of the system is restricted to lower-dimensional submanifolds, so long as certain conditions are satisfied. Such dynamic control problems occur in robotic manipulation, which usually includes some holonomic constraints between the object and the robot or the environment. By introducing a nonlinear mapping to a one-dimensional space and approximating the boundary of a discontinuous reward function, the proposed method achieves effective learning. The method is evaluated in a one-degree-of-freedom object rotating task with contact force considerations. Its effectiveness was verified by comparison with ordinary Q-learning and Dyna without the proposed mapping method.
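For reference, the ordinary Q-learning baseline mentioned above can be sketched on a one-dimensional state as follows. The chain environment, its reward, and the function name `q_learning_chain` are hypothetical and unrelated to the object rotating task; the sketch only shows the standard tabular update, not the proposed mapping.

```python
import random


def q_learning_chain(n_states=5, episodes=300, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning on a hypothetical 1D chain; actions: 0 = left, 1 = right."""
    rng = random.Random(seed)
    goal = n_states - 1
    q = [[0.0, 0.0] for _ in range(n_states)]
    for _ in range(episodes):
        s = 0
        for _ in range(100):  # step limit per episode
            # Purely random behavior policy; Q-learning is off-policy,
            # so it still learns the greedy action values.
            a = rng.randrange(2)
            s_next = max(0, s - 1) if a == 0 else s + 1
            reward = 1.0 if s_next == goal else 0.0
            # The goal is terminal, so no bootstrapping from it.
            target = reward if s_next == goal else gamma * max(q[s_next])
            q[s][a] += alpha * (target - q[s][a])
            s = s_next
            if s == goal:
                break
    return q


q = q_learning_chain()
# Greedy action for every non-terminal state.
greedy = [row.index(max(row)) for row in q[:-1]]
```

After training, the greedy action in every non-terminal state should be "right", since only reaching the right end of the chain is rewarded.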

Constrained motion of manipulation by robot hand

Submanifold generated by constrained motion in configuration space

Object rotation task with constraint to keep contact between hand and object

An example of reward profile

Publications:

  1. Yuichi Kobayashi, Hiroki Fujii and Shigeyuki Hosoe, ``Reinforcement learning for object manipulation using low-dimensional mapping,'' Transactions of the Society of Instrument and Control Engineers, Vol.42, No.7, 2006.
  2. Yuichi Kobayashi, Hiroki Fujii and Shigeyuki Hosoe, ``Reinforcement Learning for Manipulation Using Constraint between Object and Robot,'' IEEE Int. Conf. on Systems, Man & Cybernetics, Hawaii USA, 871-876, 2005.

Hyper-cubic function approximation for reinforcement learning based on autonomous-decentralized algorithm

Adaptive resolution of the function approximator is known to be important when reinforcement learning is applied to unknown problems. We propose applying a successive division and integration scheme for function approximation, based on local curvature, to Temporal Difference (TD) learning. TD learning in a continuous state space relies on non-constant value function approximation, which calls for a simple representation of the function approximator. We define the bases and the local complexity of the function approximator in a way similar to autonomous decentralized function approximation, but in a much simpler form. The simplicity of the approximator elements leads to much less computation and easier analysis. The proposed function approximator is shown to be effective on a function approximation problem and on two common reinforcement learning problems, the pendulum swing-up task and the acrobot stabilizing task.
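The idea of dividing cells according to local curvature can be illustrated with a minimal one-dimensional Python sketch. It makes several assumptions that are not taken from the paper: the target function is known in advance instead of being learned by TD, curvature is approximated by an absolute second difference over each cell, and only division is shown (integration of cells is omitted). The names `curvature` and `refine` are hypothetical.

```python
import math


def curvature(f, lo, hi):
    """Absolute second difference of f over [lo, hi], a crude curvature proxy."""
    mid = 0.5 * (lo + hi)
    return abs(f(lo) - 2.0 * f(mid) + f(hi))


def refine(f, cells, threshold, max_depth=8):
    """Repeatedly halve every cell whose curvature proxy exceeds threshold."""
    for _ in range(max_depth):
        new_cells, changed = [], False
        for lo, hi in cells:
            if curvature(f, lo, hi) > threshold:
                mid = 0.5 * (lo + hi)
                new_cells += [(lo, mid), (mid, hi)]
                changed = True
            else:
                new_cells.append((lo, hi))
        cells = new_cells
        if not changed:
            break
    return cells


def gaussian(x, sigma=0.1):
    """Narrow Gaussian bump centered at zero (hypothetical target function)."""
    return math.exp(-x * x / (2.0 * sigma * sigma))


cells = refine(gaussian, [(-1.0, 1.0)], threshold=0.01)
widths = [hi - lo for lo, hi in cells]
```

The resulting cells are small around the bump at zero and stay wide in the flat regions near the ends of the interval, which is the qualitative behavior an adaptive-resolution approximator aims for.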

Comparison of learning performance among RBF network, fixed approximation and proposed adaptive resolution approximation

Performance of control obtained by adaptive resolution function approximation

An example of adaptive resolution in pendulum swing-up application

Publications:

  1. Yuichi Kobayashi, Hideo Yuasa, Shigeyuki Hosoe, ``Hyper Cubic Function Approximation for Reinforcement Learning Based on Autonomous-Decentralized Algorithm,'' Transactions of the Society of Instrument and Control Engineers, Vol. 40, No. 8, 849-858, 2004 (in Japanese).
  2. Yuichi Kobayashi and Shigeyuki Hosoe, ``Adaptive Resolution Function Approximation for TD-learning: Simple Division and Integration,'' Proc. of SICE Annual Conference 2003, Fukui, Japan, 3023-3028, 2003.
  3. Yuichi Kobayashi and Shigeyuki Hosoe, ``Hyper-Cubic Discretization in Reinforcement Learning Based on Autonomous Decentralized Approach,'' IEEE Int. Conf. on Systems, Man & Cybernetics, Washington D.C. USA, 3633-3638, 2003.

Function approximation for reinforcement learning using autonomous-decentralized algorithm

In function approximation for reinforcement learning, the adaptability of the resolution to the complexity of the approximated function has a great influence on learning performance. We propose applying the reaction-diffusion equation on a graph to function approximation for reinforcement learning. With the proposed algorithm, the function approximator, which is expressed by nodes, can change its resolution adaptively by distributing the nodes densely in the complex regions of the state space. The function is expressed by planes, and the successive least squares method is adopted to approximate the function from the data. Each plane corresponds to a node, which is an element of the graph. Each node moves so as to diffuse the complexity of the approximated function in its neighborhood, based on the reaction-diffusion equation. The complexity of the function is defined by the change of gradient. The simulation shows two points: 1) the proposed algorithm provides adaptability for function approximation; 2) the function approximation improves the efficiency of reinforcement learning.
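How diffusion on a graph evens out a per-node quantity can be sketched in a few lines of Python. This is only the generic diffusion term, du_i/dt = rate * sum over neighbors j of (u_j - u_i), integrated with explicit Euler steps on a fixed graph. The reaction term, the motion of the nodes in the state space, and the plane fitting of the actual algorithm are not modeled, and the function name `diffuse` and all numbers are hypothetical.

```python
def diffuse(values, edges, rate=0.2, steps=200):
    """Explicit Euler integration of graph diffusion on undirected edges."""
    u = list(values)
    for _ in range(steps):
        delta = [0.0] * len(u)
        for i, j in edges:
            flow = rate * (u[j] - u[i])  # flows from the larger to the smaller value
            delta[i] += flow
            delta[j] -= flow
        u = [a + d for a, d in zip(u, delta)]
    return u


# Hypothetical path graph of five nodes with uneven "complexity" values.
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
complexity = [4.0, 0.0, 0.0, 0.0, 1.0]
equalized = diffuse(complexity, edges)
```

The total is conserved while the values converge to their common mean, which corresponds to the goal of equalizing complexity among neighboring nodes. The step size is chosen small enough for the explicit scheme to be stable on this graph.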

Patching planes together
Application to a graph with boundary
Definition of complexity
Equalization of complexity


Approximation of a 1D Gaussian function

Adaptive resolution in 1D function approximation problem

Approximation of a 2D Gaussian function

Adaptive resolution in 2D function approximation problem

Pendulum swing-up task

Comparison of learning performance between adaptive method (red) and fixed structure (blue)

Publications:

  1. Yuichi Kobayashi, Hideo Yuasa, Tamio Arai, ``Function Approximation for Reinforcement Learning Using Autonomous-Decentralized Algorithm,'' Transactions of the Society of Instrument and Control Engineers, Vol. 38, No. 2, 219-226, 2002 (in Japanese).
  2. Yuichi Kobayashi, Hideo Yuasa and Shigeyuki Hosoe, ``Q-learning with Adaptive Resolution Function Approximation based on Graph,'' Proc. of the ICASE/SICE Workshop: Intelligent Control and Systems, Muju, Korea, 79-84, 2002.
  3. Yuichi Kobayashi, Hideo Yuasa, and Tamio Arai, ``Function Approximation for Reinforcement Learning Based on Reaction-Diffusion Equation on a Graph,'' Proc. of SICE Annual Conference 2002, Osaka, Japan, 916-921, 2002.

Design of quadruped robot soccer behavior considering observational cost

In this paper, we present a real-time decision making method for a quadruped robot whose sensing and locomotion have large errors, taking into account both observational cost and optimality. We build a State-Action Map by off-line planning with Dynamic Programming (DP), considering the uncertainty of the robot's location. Using this map, the robot can immediately decide, from any state, the optimal action that minimizes the time to reach a target state. The number of observations is also minimized. We compress this map with Vector Quantization (VQ) for implementation. The total loss of optimality through compression is minimized by using the differences in value between the optimal action and the others. In simulation, the performance of some soccer behaviors was improved in comparison with existing methods. The proposed method was implemented on a real robot, and its low computational load under the memory restriction was verified experimentally.
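The idea of planning off-line and then looking up actions at run time can be sketched in Python as follows. This is a deliberately simplified illustration: the state is a cell of a small hypothetical grid, motion is deterministic, and every action costs one time step. Location uncertainty, observation actions and their cost, and the VQ compression are all left out, and the name `build_state_action_map` is an assumption.

```python
def build_state_action_map(width, height, goal):
    """Off-line DP on a grid: minimum steps to goal and one optimal action per state."""
    actions = {"N": (0, -1), "S": (0, 1), "E": (1, 0), "W": (-1, 0)}
    cost = {(x, y): float("inf") for x in range(width) for y in range(height)}
    cost[goal] = 0
    policy = {}
    changed = True
    while changed:  # sweep until no cost-to-go value improves
        changed = False
        for state in cost:
            if state == goal:
                continue
            for name, (dx, dy) in actions.items():
                nxt = (state[0] + dx, state[1] + dy)
                if nxt in cost and cost[nxt] + 1 < cost[state]:
                    cost[state] = cost[nxt] + 1
                    policy[state] = name
                    changed = True
    return cost, policy


cost, policy = build_state_action_map(4, 3, goal=(3, 1))
# At run time the decision is a single table lookup.
action = policy[(2, 1)]
```

Because all planning happens before execution, the on-line cost of a decision is one dictionary lookup, which is the property that makes a precomputed map attractive on a robot with limited computation.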

Expression of state transition with uncertainty in state space including uncertainty parameter

An example of observation strategy with real quadruped robot

Publications:

  1. Yuichi Kobayashi, Takeshi Fukase, Ryuichi Ueda, Hideo Yuasa, Tamio Arai, ``Design of Quadruped Robot Soccer Behavior Considering Observational Cost,'' Journal of the Robotics Society of Japan, Vol. 21, No. 7, 802-810, 2003 (in Japanese).
  2. Takeshi Fukase, Yuichi Kobayashi, Ryuichi Ueda, Takanobu Kawabe and Tamio Arai, ``Real-time Decision Making under Uncertainty of Self-Localization Results,'' The 2002 International RoboCup Symposium Pre-Proceedings, 372-379, 2002.
  3. Takeshi Fukase, Masahiro Yokoi, Yuichi Kobayashi, Hideo Yuasa and Tamio Arai, ``Quadruped Robot Navigation Considering the Observational Cost,'' Andreas Birk, Silvia Coradeschi and Satoshi Tadokoro (Eds.), RoboCup 2001: Robot Soccer World Cup V, Springer, 350-355, 2002.
