In this paper, a hierarchical reinforcement learning algorithm for controlling nonholonomic systems is proposed. When reinforcement learning is applied to nonholonomic systems, acquiring adequate policies is difficult because of the large number of learning steps required and convergence to locally optimal policies. Humans, however, can learn to control such systems sufficiently well even when they initially have little knowledge of the system's dynamics or of how to control it; this capability is suggested to stem from the exploration strategies they use to acquire adequate policies. The proposed algorithm is inspired by this human learning behavior. Its key element is a shaping function defined on a novel position-direction space. The shaping function is constructed autonomously once the goal is first reached, and it constrains the exploration area so that the policy can be optimized. The effectiveness of the proposed shaping function was demonstrated on a nonholonomic control problem: positioning a 2-link planar underactuated manipulator.
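As an illustration only (the paper's actual shaping function and position-direction space are defined in the body, not here), the idea of building a shaping term from the first successful trajectory and using it to penalize exploration far from that trajectory might be sketched as follows; the equal weighting of position and direction distances and the helper names are assumptions of this sketch:

```python
import math


def build_shaping_function(successful_trajectory):
    """Hypothetical sketch: after the goal is first reached, keep the
    recorded (position, direction) pairs of the successful trajectory
    and return a shaping term that is zero on the trajectory and grows
    negative with distance from it, discouraging exploration far away."""
    def shaping(pos, direction):
        # Negative distance to the nearest recorded (position, direction)
        # pair; position and direction are weighted equally (assumption).
        return -min(
            math.hypot(pos - p, direction - d)
            for p, d in successful_trajectory
        )
    return shaping


# Toy usage on a 1-D position-direction space.
recorded = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.0)]  # first successful episode
phi = build_shaping_function(recorded)

# A state on the recorded trajectory incurs no penalty ...
on_path = phi(0.5, 0.2)
# ... while a state off the trajectory is penalized, shrinking the
# effective exploration area around the known-good behavior.
off_path = phi(0.9, 0.1)
```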