Deep Convolutional Poses for Human Interaction Recognition in Monocular Videos
Human interaction recognition is a challenging problem in computer vision that has been studied for years because of its important applications. Building on recent deep models for human pose estimation, this work evaluates how effective the human pose is for recognizing human interactions in monocular videos. The proposed method has five steps: detect each person in the scene, track them, retrieve the human pose, extract features based on the pose, and finally recognize the interaction using a classifier. The Two-Person interaction dataset was used to develop this methodology. Using a whole-sequence evaluation approach, it achieved 87.56 91.10 in recognizing the interaction. The results indicate that, with recent deep models for human pose estimation, an RGB camera can be as effective as depth cameras for recognizing the interaction between two persons.
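To make the fourth step (extracting features based on the pose) more concrete, the following is a minimal sketch in Python. The abstract does not say which features the method uses, so the choice here, distances between every joint of one person and every joint of the other, averaged over the whole sequence, is an assumption for illustration only. The function names `pairwise_joint_distances` and `sequence_descriptor` are hypothetical, and the poses are assumed to be already available as lists of 2D joint coordinates produced by the earlier detection, tracking, and pose estimation steps.

```python
import math
from statistics import fmean


def pairwise_joint_distances(pose_a, pose_b):
    """Distances between every joint of person A and every joint of person B.

    Each pose is a list of (x, y) joint coordinates for one frame.
    This feature choice is an illustrative assumption, not the paper's.
    """
    return [math.dist(joint_a, joint_b) for joint_a in pose_a for joint_b in pose_b]


def sequence_descriptor(poses_a, poses_b):
    """Average the per-frame distance features over a whole sequence.

    poses_a and poses_b hold one pose per frame for each tracked person.
    The result is a fixed-length vector that a classifier could consume.
    """
    per_frame = [pairwise_joint_distances(a, b) for a, b in zip(poses_a, poses_b)]
    if not per_frame:
        raise ValueError("at least one frame is required")
    # zip(*per_frame) groups the same feature across frames.
    return [fmean(column) for column in zip(*per_frame)]


if __name__ == "__main__":
    # Two frames, person A with two joints, person B with one joint.
    poses_a = [[(0, 0), (3, 4)], [(0, 0), (6, 8)]]
    poses_b = [[(0, 0)], [(0, 0)]]
    print(sequence_descriptor(poses_a, poses_b))
```

Averaging over all frames yields one descriptor per video, which matches the idea of a whole-sequence evaluation; a per-frame or sliding-window variant would keep the per-frame vectors instead.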