| Human pose recognition has become an important research topic in computer vision, and its applications now reach into many aspects of daily life. Human pose recognition refers to locating the key body parts and major joints of people in digital images or video, and it is the basis and prerequisite of human motion recognition and behavior analysis. The technology is developing rapidly, but it still faces many challenges: lighting conditions, complex backgrounds, occlusion of the human body, and occlusion between objects all hinder progress in pose recognition research. Deep-learning-based human pose recognition achieves a high degree of abstraction and generality because its stacked convolutional neural networks extract features from large datasets. Compared with traditional image-based methods, it generalizes better and is more robust, and it performs well under poor lighting, complex backgrounds, and occlusion of body parts. In practical applications, however, it still suffers from complex model structures, relatively redundant modules, and real-time performance that needs improvement. Achieving high-precision, real-time pose recognition in natural environments therefore remains a great challenge. To address human pose recognition in 2D images and videos, this dissertation proposes an improved OpenPose model that achieves a better trade-off between accuracy and real-time performance and makes full use of the advantages of a residual network as the backbone. In addition, by observing the patterns of joint point motion, the joint point recognition results are applied to a simple human motion classification scenario. Together these address human pose recognition in natural images and human motion classification in natural videos. The main research contents are summarized as follows: (1) A human pose recognition method based on the fusion of front-layer information is
proposed. It improves on the OpenPose model to address the insufficient real-time performance of human pose recognition in natural images. The basic OpenPose model has a large number of parameters and is difficult to train, mainly because of the large number of parameters in its VGG19 backbone. To improve the real-time performance of the basic OpenPose model, this method exploits the ease of back-propagation through residual networks: it uses ResNet as the backbone, reduces the number of stages, and replaces large convolution kernels with small ones, thereby reducing the number of parameters and the amount of computation and improving real-time performance in pose estimation. First, using the ResNet residual structure instead of the traditional VGG structure improves the efficiency of back-propagation and speeds up model convergence. In addition, in the part of the network that recognizes specific joint points, the last two stages are removed and the large convolution kernels in the remaining stages are replaced by cascaded small convolution kernels, which effectively reduces the amount of computation and improves both training efficiency and real-time performance at prediction time. Experimental results on the COCO dataset show that, compared with other methods, this method maintains a certain level of accuracy in pose recognition. At the same time, a comparison with the prediction time of the original model shows that this method also improves real-time performance. (2) A human motion recognition method based on discriminant factor classification is proposed. The scenes of standing, walking, and approaching are distinguished according to the values of the discriminant factors, which are used to implement simple human motion classification tasks. Human motion classification methods based on deep
learning technology often handle many classes and achieve high accuracy, but they also have shortcomings such as complex models and cumbersome deployment. This method proposes a classification criterion in which actions are defined by empirically set values. After the human joint point information is acquired, multi-target tracking is first performed to obtain the joint point trajectory of each target, and a manually designed behavior template is then used for action matching based on the joint point information. In this way, a simple application that goes from human pose recognition to human action classification is realized. |
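
The saving from replacing a large convolution kernel with cascaded small kernels, as described in contribution (1), can be illustrated with simple arithmetic. The sketch below is only an illustration: the abstract does not give kernel sizes or channel widths, so the 7x7 kernel, the three 3x3 kernels, and the 128-channel width are assumptions, and the helper names are hypothetical.

```python
def conv_params(kernel_size, in_channels, out_channels, bias=True):
    """Number of learnable parameters in one square 2D convolution layer."""
    weights = kernel_size * kernel_size * in_channels * out_channels
    return weights + (out_channels if bias else 0)


def stacked_receptive_field(kernel_sizes):
    """Receptive field of stride-1 convolutions applied one after another."""
    field = 1
    for k in kernel_sizes:
        field += k - 1
    return field


# Assumed channel width; not stated in the abstract.
CHANNELS = 128

# One large 7x7 layer versus a cascade of three 3x3 layers (assumed sizes).
single_7x7 = conv_params(7, CHANNELS, CHANNELS)
cascade_3x3 = 3 * conv_params(3, CHANNELS, CHANNELS)

print("receptive field of cascade:", stacked_receptive_field([3, 3, 3]))
print("7x7 parameters:", single_7x7)
print("3 x (3x3) parameters:", cascade_3x3)
print("ratio: %.2f" % (cascade_3x3 / single_7x7))
```

Under these assumptions the cascade covers the same 7x7 receptive field with roughly 55% of the parameters, which is consistent with the reduction in computation the abstract describes.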
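
The threshold-based classification in contribution (2) can be sketched as follows. This is a hypothetical illustration, not the dissertation's actual criterion: the abstract does not define the discriminant factors or their empirical values, so the two factors used here (horizontal hip displacement and growth in apparent torso length, both normalized by the initial torso length), the thresholds, the reading of "approaching" as moving toward the camera, and the `neck`/`hip` joint names are all assumptions.

```python
from math import hypot


def torso_length(frame):
    """Distance in pixels between the neck and hip joints in one frame."""
    (nx, ny), (hx, hy) = frame["neck"], frame["hip"]
    return hypot(nx - hx, ny - hy)


def classify_action(track, move_thresh=0.5, scale_thresh=0.2):
    """Label one tracked person's joint trajectory with a simple action.

    track: list of per-frame dicts {"neck": (x, y), "hip": (x, y)} for a
    single target, as produced by a tracker. Thresholds are illustrative
    empirical values, not values from the source.
    """
    if len(track) < 2:
        raise ValueError("need at least two frames")
    first, last = track[0], track[-1]
    start_len = torso_length(first)
    if start_len == 0:
        raise ValueError("degenerate torso in first frame")

    # Assumed factor 1: relative growth of the torso in the image,
    # used here as a proxy for moving toward the camera.
    scale_growth = (torso_length(last) - start_len) / start_len
    # Assumed factor 2: horizontal hip shift in units of torso length.
    shift = abs(last["hip"][0] - first["hip"][0]) / start_len

    if scale_growth > scale_thresh:
        return "approaching"
    if shift > move_thresh:
        return "walking"
    return "standing"


# Synthetic trajectories for illustration only.
standing = [{"neck": (100, 50), "hip": (100, 150)} for _ in range(5)]
walking = [{"neck": (100 + 25 * i, 50), "hip": (100 + 25 * i, 150)} for i in range(5)]
approaching = [{"neck": (100, 50), "hip": (100, 150 + 12.5 * i)} for i in range(5)]

for name, track in [("standing", standing), ("walking", walking), ("approaching", approaching)]:
    print(name, "->", classify_action(track))
```

A rule set of this kind needs no training and is easy to deploy, which matches the motivation given in the abstract, at the cost of covering only a few hand-designed actions.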