Detecting human-object interaction in video is an important problem in many computer vision applications. Among the various types of human-object interaction, the case in which a person is in the middle of moving an object by hand is key to observing critical events such as stealing luggage and abandoning suspicious substances in public spaces. This paper proposes a novel method for detecting this type of human-object interaction. In the proposed method, a surrounding area is set around each hand in the input video frames, and the motion distribution within each surrounding area is analyzed. A hand is judged to be moving an object if its surrounding area contains regions in which movements similar to those of the hand are concentrated. Because the proposed method need not explicitly extract object regions or recognize their relations to person regions, it is expected to detect this interaction, specifically hands that are in the middle of moving objects, more effectively in diverse situations, e.g., when several persons individually move unknown objects by hand in the same scene.
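The decision rule described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a dense per-pixel motion field (from any optical flow estimator) and a hand bounding box are already available, and the function name `hand_moves_object`, the cosine and speed similarity test, the grid-cell notion of "concentration", and all threshold values are hypothetical choices made only for illustration.

```python
import numpy as np

def hand_moves_object(flow, hand_box, margin=20, cell=10,
                      min_cos=0.9, max_speed_diff=0.5,
                      min_density=0.6, min_speed=1.0):
    """Illustrative test of whether a hand is moving an object.

    flow     : (H, W, 2) array of per-pixel motion vectors (dx, dy).
    hand_box : (x0, y0, x1, y1) hand bounding box in pixels.
    All thresholds are assumed values, not taken from the paper.
    """
    h, w = flow.shape[:2]
    x0, y0, x1, y1 = hand_box

    # Mean motion of the hand region.
    hand_vec = flow[y0:y1, x0:x1].reshape(-1, 2).mean(axis=0)
    hand_speed = np.linalg.norm(hand_vec)
    if hand_speed < min_speed:
        return False  # a (nearly) stationary hand is not moving anything

    # Surrounding area: the hand box expanded by `margin`, clipped to the frame.
    sx0, sy0 = max(x0 - margin, 0), max(y0 - margin, 0)
    sx1, sy1 = min(x1 + margin, w), min(y1 + margin, h)
    area = flow[sy0:sy1, sx0:sx1]

    # Pixels whose motion is similar to the hand's in direction and speed.
    speed = np.linalg.norm(area, axis=2)
    cos = (area @ hand_vec) / (speed * hand_speed + 1e-9)
    similar = (cos >= min_cos) & (
        np.abs(speed - hand_speed) <= max_speed_diff * hand_speed)

    # The hand itself trivially moves like the hand, so exclude it.
    similar[y0 - sy0:y1 - sy0, x0 - sx0:x1 - sx0] = False

    # "Concentration": some grid cell is densely filled with similar motion.
    for cy in range(0, similar.shape[0] - cell + 1, cell):
        for cx in range(0, similar.shape[1] - cell + 1, cell):
            if similar[cy:cy + cell, cx:cx + cell].mean() >= min_density:
                return True
    return False
```

Note that the sketch never segments or identifies the object; it only inspects how hand-like motion is distributed around the hand, which mirrors the property the abstract emphasizes.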