Research On Human Posture Estimation Based On High-resolution Deep Neural Networ

Posted on: 2024-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: L Li
GTID: 2568307130972549
Subject: Information and Communication Engineering
Abstract/Summary:
Human pose estimation detects a person in an image or video frame, locates and labels the body's joints (key points), and connects them into a skeleton. It is an active focus of computer vision research and, with the development of artificial intelligence, plays an important role in tasks such as human action recognition and human-computer interaction. Research on human pose estimation has matured steadily alongside deep learning and neural networks. However, because of the complexity of the human body, camera angles, cluttered environments, and the diversity of poses, people in images often appear at different scales, occlude one another, and overlap with the background, which makes improving detection accuracy challenging. At the same time, convolutional neural networks have large numbers of parameters and high computational complexity, which leads to long running times. To address these problems, this thesis builds on the high-resolution network and focuses on improving its detection accuracy and on designing a lightweight version of it. The main research contents are as follows:

(1) To address the difficulty of predicting correct poses when the scale of the human body varies, the high-resolution network (HRNet) is optimized and a high-resolution network model based on a multi-scale attention mechanism (MSANet) is proposed. The method combines the multi-scale processing capability of pyramid convolution with attentional feature fusion to rebuild the basic module of the high-resolution network. The optimized network is tested on the COCO dataset. The experimental results show that the improved basic module raises the average estimation accuracy of the high-resolution network by 2.5%.

(2) For key points that are hard to detect in complex environments, such as occluded or overlapping ones, precise detection, localization, and classification are achieved by making better use of the semantic information in high-level features and the fine-grained detail in low-level features. First, an improved non-local auto-converter module enhances global spatial features before multi-resolution fusion, so that the network can extract more spatial feature information in the multi-resolution fusion stage. Then, in the final stage, the features of each layer are fused with an adaptive spatial feature fusion strategy, and the pose adjustment machine applies spatial attention and channel attention to the fused features to extract richer spatial and semantic information, giving more accurate localization of difficult key points. Finally, training, testing, ablation, and visualization experiments are carried out on the COCO and MPII datasets. Compared with the original network, the improved network is more sensitive to difficult key points and more robust to interference when detecting them.

(3) Because large high-resolution networks are difficult to deploy on mobile devices and embedded platforms, a lightweight design is proposed. Based on the Micro-Block module of MicroNet, a lightweight basic module is proposed to reduce the number of parameters and the computational complexity of the network. The module uses micro-factorized convolution to reduce the connectivity between feature nodes, which avoids reducing the network width, and uses a dynamic activation function to improve nonlinearity, which compensates for the performance loss caused by reduced network depth. The experimental results show that, compared with the original network, the number of parameters falls from 28.5 M to 17.24 M, a reduction of 39.5%, and the computational complexity falls from 7.10 G to 6.02 G, a reduction of 15.2%.
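
The two reduction percentages reported for the lightweight network follow directly from the before and after values. A minimal sketch of that arithmetic is given here; `relative_reduction` is a hypothetical helper name introduced only for illustration and is not part of the thesis.

```python
def relative_reduction(before: float, after: float) -> float:
    """Return the percentage reduction when going from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100.0


# Values taken from the abstract; the helper itself is an illustrative assumption.
params_reduction = relative_reduction(28.5, 17.24)   # parameters, in millions (M)
compute_reduction = relative_reduction(7.10, 6.02)   # computational complexity, in G

print(f"parameters reduced by {params_reduction:.1f}%")
print(f"computation reduced by {compute_reduction:.1f}%")
```

Rounded to one decimal place, both results agree with the figures stated in the abstract.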
Keywords/Search Tags: Human posture estimation, High-resolution network, Multi-scale, Attention mechanism, Occlusion, Lightweight