| In recent years, Convolutional Neural Networks (CNNs) have developed rapidly in the field of image processing, and CNN-based algorithms have become an important approach to human keypoint detection, which plays an important role in intelligent monitoring, human-computer interaction, and other fields. On the one hand, owing to the particular nature of human keypoint detection, current networks cannot fully integrate high-level and low-level semantic features, nor can they adaptively adjust the weight of useful information in the feature matrix. Moreover, existing algorithms stack a large number of convolutional networks to build complex structures; although this can significantly improve accuracy, it consumes a lot of resources. On the other hand, algorithms deployed on embedded devices must run with limited computing power, which makes real-time performance difficult to guarantee, and embedded deployment is an important step in commercializing these algorithms. This thesis therefore studies the above issues as follows. (1) To address the inability of existing network structures to fully combine high-level and low-level semantic features, and inspired by the spatial pyramid structure and the self-learning module, this thesis proposes a new Joint Self-Learning Attribute Pyramid Module (JSLAPM). The module is general and can be applied to any open-source backbone network. It is composed of a Multidimensional Self-Learning Module (MSLM) and a Feature Pyramid Module (FPM). The former adjusts the feature matrix by learning the importance of its elements in the spatial and channel dimensions, while the latter enhances the expressive power of the features by fusing feature matrices from different depths. In addition, mutual interference between predictions during the prediction process leads to a low detection rate. To solve this
problem, this thesis proposes a Channel Separation Extraction Module (CSEM). The accuracy of the network is improved by about 3%. (2) Executing the network model is time-consuming. Inspired by the encoder-decoder (codec) structure used in pixel-level segmentation network architectures, this thesis designs an efficient lightweight backbone network that uses depthwise separable convolution as its basic convolution module. MSLM is used to learn the feature matrix adaptively and increase the weight of useful information, while the codec structure improves the efficiency of the backbone. The resulting network is called the Codec Depth Separable Network (CSDNet); compared with MobileNet, its performance is improved by 72% and its accuracy by 5.2%. (3) Considering embedded devices with limited computing resources, this thesis uses a Raspberry Pi as the experimental platform and studies the most convenient way to compile and install TensorFlow on it and to run the proposed lightweight network. CSDNet is deployed on the Raspberry Pi and tested on the human keypoint detection task, and its accuracy and speed are compared to demonstrate the effectiveness and real-time performance of the lightweight network. |
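
As a rough illustration of why depthwise separable convolution suits a lightweight backbone, the sketch below compares the weight count of a standard convolution with that of its depthwise separable counterpart. The function names and the layer sizes are hypothetical and are not taken from CSDNet; bias terms are ignored.

```python
def standard_conv_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a standard k x k convolution (bias terms ignored)."""
    return kernel_size * kernel_size * in_channels * out_channels


def depthwise_separable_params(kernel_size: int, in_channels: int, out_channels: int) -> int:
    """Weights in a depthwise k x k convolution followed by a 1 x 1 pointwise convolution."""
    depthwise = kernel_size * kernel_size * in_channels  # one k x k filter per input channel
    pointwise = in_channels * out_channels               # 1 x 1 convolution mixes the channels
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes chosen only for illustration, not CSDNet's actual configuration.
    k, c_in, c_out = 3, 128, 256
    std = standard_conv_params(k, c_in, c_out)
    sep = depthwise_separable_params(k, c_in, c_out)
    print(f"standard: {std}, separable: {sep}, ratio: {sep / std:.3f}")
```

In general the ratio of the two counts is 1/C_out + 1/k^2, so for 3 x 3 kernels and wide layers the separable form needs roughly one ninth of the weights. How much of that saving carries over to measured speed on a Raspberry Pi depends on the runtime and is not something this sketch estimates.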