| With the rapid development of machine vision, pose estimation has become an important component of many computer vision applications. Human pose estimation has achieved great success, whereas work on animal pose estimation remains scarce, and several problems persist. First, data sets for animal pose estimation are lacking. Second, when the camera angle is unfavorable or the animal's limbs are occluded, the network struggles to extract keypoint information for the affected parts, which leads to inaccurate estimates. Third, some pose estimation models rely on deep networks and therefore have both large parameter counts and high computational complexity. To address these problems, this paper studies terrestrial animal pose estimation based on deep learning and makes the following contributions. (1) Because existing animal pose estimation methods lack sufficient data sets for analyzing the key characteristics of animals, this study combines synthetic and real animal data sets to train the network model, which yields more accurate predictions and reduces training cost. The improved stacked hourglass network model introduces the SE attention mechanism and proposes a multi-scale max pooling module based on SE attention (MMPM-S). This module applies max pooling at four different scales to the input features to improve the model's ability to capture global information. The model also introduces CBAM attention and proposes an improved hourglass module based on the CBAM attention mechanism (IHNM-C), which improves network accuracy at minimal parameter cost. The model is trained, validated, and tested on the horse data of the TigDog dataset. Experiments show that the method reaches a percentage of correct keypoints (PCK@0.05) of 73.35%. (2) To address the difficulty of extracting hard-to-detect keypoints of
animals, this paper first proposes a feature fusion method for the stacked hourglass network. After image preprocessing, the original input image is fed to every hourglass module so that each module obtains more accurate feature information about animal keypoints. This paper also proposes feature fusion within the hourglass module itself: the input of a single hourglass module is passed to its outermost layer, which prevents the model from losing a large amount of deep and shallow feature information and thereby improves detection accuracy. The network is trained, validated, and tested on the horse data of the TigDog dataset. Experiments show that the method reaches a PCK@0.05 of 74.01%. (3) To address the growth in parameters and computational complexity of the improved stacked hourglass network model, an improved conditional channel weighting unit based on ECA (ECA-ICCW) is proposed to replace the 3×3 convolution in the residual module, which makes the network lightweight while bringing a clear performance gain. In addition, a residual module, ICCW-Bottle (a bottleneck based on the lightweight unit ECA-ICCW), is proposed. The Mish activation function replaces the ReLU activation function of the original residual module, which further improves the network's optimization and generalization ability while keeping the model lightweight. For pose estimation of large-scale target animals, a dual-branch feature fusion method is proposed. The preprocessed image is passed through a depthwise separable convolution and concatenated into the last hourglass network, which fully integrates high-level and low-level feature information and effectively improves the detection accuracy of difficult animal keypoints. Training, validation, and testing are carried out on the horse data of the TigDog dataset. The experimental results show that the estimation accuracy of this method has improved to 74.63%, and
the complexity and other indicators have been greatly reduced. |
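
All three results above are reported as PCK@0.05. As a rough illustration of how such a score can be computed, the sketch below counts a predicted keypoint as correct when it lies within 0.05 times a reference length of the ground truth. The abstract does not state which reference length is used, so the longer side of the bounding box is an assumption here, and the function name `pck` and its parameters are hypothetical.

```python
import math


def pck(pred, gt, bbox_size, alpha=0.05, visible=None):
    """Percentage of Correct Keypoints (PCK) for one animal instance.

    pred, gt   : lists of (x, y) keypoint coordinates in pixels.
    bbox_size  : (width, height) of the ground-truth bounding box.
                 ASSUMPTION: its longer side is the normalising length.
    alpha      : threshold factor (0.05 for PCK@0.05).
    visible    : optional list of booleans; False entries are skipped.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same number of keypoints")
    if visible is None:
        visible = [True] * len(gt)

    threshold = alpha * max(bbox_size)  # maximum allowed pixel error
    correct = 0
    counted = 0
    for (px, py), (gx, gy), vis in zip(pred, gt, visible):
        if not vis:
            continue  # unannotated or occluded keypoints are not scored
        counted += 1
        if math.hypot(px - gx, py - gy) <= threshold:
            correct += 1
    return 100.0 * correct / counted if counted else 0.0


# Toy example: a 200x100 box gives a threshold of 0.05 * 200 = 10 pixels.
gt = [(10, 10), (50, 50), (100, 80), (150, 20)]
pred = [(12, 11), (50, 65), (104, 83), (150, 20)]
score = pck(pred, gt, bbox_size=(200, 100))
print(score)  # 75.0: the second keypoint is 15 pixels off, the rest are within 10
```

A dataset-level score would average this quantity over all test instances. Other normalising lengths, such as a body-part length, are also in common use and would change the resulting numbers.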