
Multi-Scale Stacked Hourglass Network For Human Pose Estimation

Posted on: 2020-02-19
Degree: Master
Type: Thesis
Country: China
Candidate: W L Du
GTID: 2428330572461543
Subject: Electronics and Communications Engineering

Abstract/Summary:
Human pose estimation aims to detect and localize the main joints of the human body in an image. It is a key technology for image understanding and behavior recognition, and a basic problem in computer vision. Recently, the application of deep convolutional neural networks has greatly improved the accuracy of human pose estimation. However, when key points are occluded, people overlap, or backgrounds are complex, some key points are still identified incorrectly. To address these problems, this thesis first proposes a multi-scale stacked hourglass network structure to enhance the discriminative ability of each hourglass network. It then proposes a new loss function that dynamically adjusts the loss weights in the multi-scale cascade structure to improve localization accuracy.

The core of human pose estimation lies in the global information that determines the key point type and the local information that determines the key point location, so a reasonable and effective information processing pipeline is critical to the results. In the stacked hourglass network, simply increasing the depth of the cascade does not effectively improve accuracy. To address this problem, we propose a multi-scale collaborative network. A multi-scale pre-processing network produces feature maps at different scales and feeds them into different positions of the stacked hourglass network: small-scale features are fed into the front of the stack, and large-scale features into the back. As a further enhancement of the hourglass network, the Inception-resnet module, used as its minimum component module, combines the Inception structure's ability to process multi-scale information with the Resnet structure's ability to avoid vanishing gradients as the network deepens. Through multiple sets of comparative experiments, quantitative evaluation is carried out on the MPII and LSP datasets. The accuracy of the multi-scale stacked hourglass network is 0.41% higher than that of the stacked hourglass network, and the model optimized with the Inception-resnet minimum module improves accuracy by 0.83%. These experiments verify the validity and applicability of the two structural optimizations.

Relay supervision is used in the cascaded hourglass network to ensure the accuracy of the output of each hourglass network. However, the average weighted loss function gives every key point the same loss weight, which is not conducive to improving the accuracy of the network as a whole. To solve this problem, we propose a dynamic weight optimization method based on Adaboost. Different key points are assigned different loss weight coefficients, independent of scale, and these coefficients are adjusted dynamically from the top hourglass network to the bottom hourglass network. The same network model is trained with both average weighting and adaptive weighting. Experiments show that, under the PCKh@0.2 and PCP@0.5 evaluation metrics, the adaptively weighted model improves accuracy over average weighting by 0.8% and 0.7%, respectively. The thesis also compares the effect of the two methods on predicted images. Quantitative and qualitative experiments verify that the adaptive weighting method forms an effective cooperation mechanism within the stacked hourglass network and improves accuracy.
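
The abstract does not give the weighting formula, so the following is only a minimal sketch of how an Adaboost-style per-key-point loss weighting across stacked stages might look. Everything in it is an assumption made for illustration: the function names, the exponential update rule, the `beta` step size, and the normalization that keeps the weights summing to the number of key points. It should not be read as the thesis's actual method.

```python
import numpy as np


def per_keypoint_mse(pred, target):
    """Mean squared error per key point for heatmaps of shape (K, H, W)."""
    return ((pred - target) ** 2).mean(axis=(1, 2))


def adaptive_weighted_loss(stage_preds, target, beta=1.0):
    """Sum a weighted heatmap loss over stacked stages (hypothetical scheme).

    stage_preds: list of (K, H, W) arrays, one per hourglass stage,
                 ordered from the first stage to the last.
    target:      (K, H, W) ground-truth heatmaps.
    beta:        assumed step size of the weight update.
    Returns the total loss and the weights used at each stage.
    """
    num_keypoints = target.shape[0]
    weights = np.ones(num_keypoints)  # first stage: average weighting
    total = 0.0
    history = []
    for pred in stage_preds:
        err = per_keypoint_mse(pred, target)
        total += float((weights * err).sum())
        history.append(weights.copy())
        # Adaboost-inspired update (assumption): key points with
        # above-average error get a larger weight at the next stage.
        relative = err / (err.mean() + 1e-12)
        weights = weights * np.exp(beta * (relative - 1.0))
        # Renormalize so the weights always sum to the number of key points.
        weights = weights * num_keypoints / weights.sum()
    return total, history
```

With uniform weights at every stage this reduces to the average weighted loss used under relay supervision; the update only redistributes the loss toward key points that earlier stages localize poorly.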
Keywords/Search Tags:human pose estimation, hourglass network, multi-scale collaboration, multi-scale linear weighting