Visual analysis of human motion in static images has long been a significant research topic in computer vision. Deep learning, represented by convolutional neural networks, has attracted considerable attention from academia and industry in recent years and has achieved dramatic breakthroughs in computer vision. We focus on human action detection and pose estimation with convolutional networks. Action detection means detecting the persons in an image and predicting their action labels at the same time. Human pose estimation means locating human joints and connecting them in order to recover the structure of the human body. Human actions can be recognized with or without knowledge of the person's ground-truth location. Most related work assumes that the person's precise location is given and treats action recognition as a classification problem over a predefined set of actions. However, this assumption is unrealistic in practice. Action detection recognizes human actions without relying on the person's ground-truth location, which is more challenging. We apply two state-of-the-art detection algorithms based on convolutional neural networks, Faster RCNN and SSD, to action detection and evaluate their performance on this fine-grained task. In addition, we compare against existing action detection work and show that our action detection model performs much better. Human pose estimation can be performed on images containing a single person or multiple persons. We address the multi-person pose estimation problem and propose a bottom-up method based on human part detection. Building on Faster RCNN, we develop a multiple-branch Faster RCNN model that detects human parts as well as persons in the image. Once the part and person bounding boxes are obtained, we apply a series of simple but practical rules to locate joints and match persons with parts. Experiments on several datasets demonstrate that our multiple-branch model outperforms the original Faster RCNN on the person and part detection task. Our method achieves comparable or better performance than other pose estimation algorithms while running much faster.
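
The abstract does not spell out the rules used to match persons with parts. As a minimal, purely illustrative sketch (not the authors' rule set), one could assume that each detected part box is assigned to the person box that contains the largest fraction of its area, that only the highest-scoring part of each type is kept per person, and that the joint is placed at the centre of the part box. The function names, the `(name, score, box)` part format, and the `min_fraction` threshold below are all hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format
Part = Tuple[str, float, Box]            # (part name, detection score, box), assumed format


def box_area(b: Box) -> float:
    """Area of a box; zero if the box is empty or inverted."""
    return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])


def overlap_fraction(part: Box, person: Box) -> float:
    """Fraction of the part box's area that lies inside the person box."""
    inter = (max(part[0], person[0]), max(part[1], person[1]),
             min(part[2], person[2]), min(part[3], person[3]))
    area = box_area(part)
    return box_area(inter) / area if area > 0 else 0.0


def box_center(b: Box) -> Tuple[float, float]:
    """Assumed joint rule: the joint sits at the centre of the part box."""
    return ((b[0] + b[2]) / 2, (b[1] + b[3]) / 2)


def match_parts_to_persons(
    persons: List[Box], parts: List[Part], min_fraction: float = 0.5
) -> List[Dict[str, Tuple[float, float]]]:
    """Return one {part name: joint (x, y)} dict per person box."""
    poses: List[Dict[str, Tuple[float, float]]] = [dict() for _ in persons]
    best_score: List[Dict[str, float]] = [dict() for _ in persons]
    for name, score, box in parts:
        fractions = [overlap_fraction(box, p) for p in persons]
        if not fractions:
            continue  # no persons detected
        idx = max(range(len(persons)), key=fractions.__getitem__)
        if fractions[idx] < min_fraction:
            continue  # part does not clearly belong to any person
        # Keep only the highest-scoring detection of each part type per person.
        if score > best_score[idx].get(name, float("-inf")):
            best_score[idx][name] = score
            poses[idx][name] = box_center(box)
    return poses
```

Because this kind of matching needs only box arithmetic on the detector's output, it adds very little cost on top of detection, which is consistent with the speed advantage claimed for the bottom-up approach.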