
Research On Two-dimensional Human Pose Estimation Algorithm Based On Deep Learning

Posted on: 2023-08-21
Degree: Master
Type: Thesis
Country: China
Candidate: X M Zhao
GTID: 2558307154975469
Subject: Engineering
Abstract/Summary:
Human pose estimation aims to localize human body parts in given images or videos. It is a fundamental component of many downstream vision tasks, such as human pose tracking, action recognition, and person re-identification. With the development of deep learning, methods based on deep convolutional neural networks have gradually been introduced into this field. Such networks can not only learn and extract features directly from the input but also model the relationships between joints, which greatly improves detection accuracy. This thesis likewise builds on deep neural networks and redesigns the network from several perspectives, such as feature fusion, spatial mutual information capture, and model efficiency, to alleviate ambiguity in joint prediction and to better balance model performance against efficiency. The specific contributions are summarized as follows:

(1) A gating-based multi-scale feature fusion mechanism is proposed. Almost all state-of-the-art pose estimation methods combine multi-level features by directly adding them at each position. This can introduce noisy information, especially in complex situations (crowded scenes, occlusions, and unusual poses). This thesis proposes a gated multi-scale feature fusion module. First, a gate is generated by adaptively assigning weights to the features of each level in a data-driven manner. Second, the gate is used to control how much information from each level is passed on. This effectively suppresses noisy information and thereby reduces ambiguity in pose estimation.

(2) A fine-tuning strategy based on spatial mutual information is proposed. Guided by the prior knowledge that the positions of human joints can inform one another, many previous works have achieved better results by stacking a single network module to refine the initial prediction. Inspired by this, this thesis proposes a spatial mutual information complementary module, drawing on the way correlations between word vectors are obtained in natural language processing. The module helps the model adjust the position of the current joint by capturing the information contained in the other joints, so that false detections and missed detections can be corrected in time. It achieves higher accuracy while adding only a limited amount of computation.

(3) An efficient human pose estimation network is proposed. From the perspective of the receptive field, this thesis explores the receptive field size appropriate for the pose estimation task by analyzing the performance of high-resolution networks with different numbers of paths, and proposes an efficient human pose estimation network on this basis. Experimental results show that the network retains the generalization ability of the original network while reducing the amount of computation and the number of parameters.
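The contrast drawn in contribution (1) between direct addition and gated fusion can be illustrated with a minimal sketch. It is not the thesis's actual module: the abstract does not say how the gate is computed, so the per-level scalar `weights` and `biases`, the softmax over levels, and the function names are all assumptions made for illustration. Feature maps are assumed to be already resized to a common resolution and are flattened to plain lists of floats, one value per position.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def additive_fusion(levels):
    """Baseline: add the features of all levels at each position."""
    return [sum(vals) for vals in zip(*levels)]


def gated_fusion(levels, weights, biases):
    """Fuse same-sized feature maps with a per-position gate over levels.

    levels  : list of feature maps, each a list of floats of equal length
    weights : one hypothetical learned scalar per level (assumption)
    biases  : one hypothetical learned scalar per level (assumption)
    Returns the fused map and the gate values used at each position.
    """
    n_pos = len(levels[0])
    if any(len(f) != n_pos for f in levels):
        raise ValueError("all levels must share the same resolution")

    fused, gates = [], []
    for p in range(n_pos):
        # Data-driven gate: the score of each level depends on its own feature.
        logits = [w * f[p] + b for f, w, b in zip(levels, weights, biases)]
        g = softmax(logits)  # weights over levels sum to 1 at this position
        gates.append(g)
        # The gate scales how much of each level is passed on.
        fused.append(sum(gi * f[p] for gi, f in zip(g, levels)))
    return fused, gates


if __name__ == "__main__":
    levels = [[1.0, 2.0], [3.0, 6.0]]
    print(additive_fusion(levels))
    print(gated_fusion(levels, weights=[0.5, -0.5], biases=[0.0, 0.0])[0])
```

With all weights and biases set to zero the gate is uniform and the result is the per-position mean of the levels; a level whose score is much lower than the others is suppressed almost entirely, which is the behavior the abstract attributes to the gate when a level carries noisy information.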
Keywords/Search Tags: Human pose estimation, Deep learning, Feature fusion, Spatial mutual information, Model efficiency