Research On Human Action Recognition Based On Skeleton Features

Posted on: 2024-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: S Y Chen
GTID: 2568306917497474
Subject: Electronic information
Abstract/Summary:
Human action recognition is an active research topic in computer vision, with wide application in video surveillance and human-computer interaction. With recent progress in human pose estimation, action recognition based on skeleton features has achieved promising results. Nevertheless, practical application remains challenging. For example, large-scale neural networks are difficult to deploy on devices with limited computing power, whereas lightweight models and hidden Markov models (HMMs) with few parameters can meet the real-time requirements of action recognition. In addition, in scenes with severe occlusion and high accuracy requirements, the convolutional neural network (CNN) has become the mainstream method for human pose estimation and human action recognition because of its powerful learning ability. However, simply applying a CNN to the human skeleton may ignore the natural limb structure, whereas a semantic graph convolutional network (SemGCN) can capture the semantic information in skeleton data according to the human limb structure. Based on this analysis, this thesis studies human action recognition based on skeleton features, focusing on human pose estimation and skeleton-based action recognition. The main contents are as follows:

(1) A top-down human pose estimator is designed, consisting of a backbone, an initial pose estimation module, and a pose correction module. The backbone generates feature maps from the original image, and the initial pose estimation module then roughly locates the human keypoints. The pose correction module is a SemGCN-based coordinate regression network that produces the refined pose by processing the implicit body structure together with visual features. During training, the online hard keypoint mining (OHKM) algorithm is used to balance the training weights of easy and hard keypoints. Experiments and comparisons on public datasets demonstrate the effectiveness and accuracy of the proposed estimator.

(2) A skeleton-based action recognition model based on spatio-temporal features is built, consisting of a spatial feature extraction module, an atomic action classification module, and an action classification module. Because of limited device computing power and other factors, the thesis applies the lightweight OpenPose to extract the human skeleton from each frame of the video, and then obtains the spatial feature sequence of the action by calculating angle features and keypoint distances. After the atomic actions are classified from the spatial feature sequence, an HMM is used to learn how the atomic actions change over time and to classify the human actions in the video. The experimental results indicate that the proposed model achieves high accuracy in human action recognition.

(3) A skeleton-based action recognition model based on two-stream semantic graph convolution is constructed, consisting of a skeleton sequence preprocessing module, a spatial feature extraction network, and a temporal feature extraction network. Standard skeleton data is obtained by processing the human keypoint annotations, and the motion information of the skeleton is calculated from this data. Following the natural body structure, two identical spatial graph convolutional networks (SGCN) are employed to extract local keypoint features from the standard skeleton data and from the motion information. The temporal feature extraction network then captures how the relationships between keypoints change over time. Finally, the spatio-temporal features extracted by the two branches are fused for action classification. Experimental results verify that the proposed model can effectively learn features of human action and obtains strong recognition results.
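To illustrate the idea behind OHKM mentioned in item (1), the following is a minimal sketch, not the thesis's implementation. It assumes a common formulation in which only the `top_k` keypoints with the largest per-keypoint loss contribute to the final loss; the function name `ohkm_loss`, the squared-error loss, and the default `top_k` value are all assumptions made for illustration.

```python
def ohkm_loss(pred, target, top_k=2):
    """Mean squared error over the top_k hardest keypoints only.

    pred, target: equal-length lists of (x, y) keypoint coordinates.
    top_k is a hypothetical hyperparameter, not a value from the thesis.
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same number of keypoints")
    if not pred:
        return 0.0
    # Per-keypoint squared Euclidean error.
    per_keypoint = [
        (px - tx) ** 2 + (py - ty) ** 2
        for (px, py), (tx, ty) in zip(pred, target)
    ]
    # Keep only the k largest errors, i.e. the "hard" keypoints.
    k = max(1, min(top_k, len(per_keypoint)))
    hardest = sorted(per_keypoint, reverse=True)[:k]
    return sum(hardest) / k
```

With this formulation, keypoints that are already well localized drop out of the loss, so gradient updates concentrate on occluded or otherwise difficult joints.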
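The angle and keypoint-distance features described in item (2) can be sketched as follows. This is an illustrative example under stated assumptions: keypoints are 2D pixel coordinates, and the joint names, the choice of angle triples, and the choice of distance pairs are hypothetical rather than taken from the thesis.

```python
import math


def joint_angle(a, b, c):
    """Angle in radians at joint b, between segments b->a and b->c."""
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(*v1)
    n2 = math.hypot(*v2)
    if n1 == 0 or n2 == 0:
        return 0.0  # degenerate limb: two keypoints coincide
    cos = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos)))


def keypoint_distance(a, b):
    """Euclidean distance between two keypoints."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def frame_features(keypoints, angle_triples, distance_pairs):
    """Build one spatial feature vector for a single frame.

    keypoints: dict mapping a joint name to its (x, y) position.
    angle_triples: list of (end, vertex, end) joint-name triples.
    distance_pairs: list of (joint, joint) name pairs.
    """
    angles = [joint_angle(keypoints[a], keypoints[b], keypoints[c])
              for a, b, c in angle_triples]
    dists = [keypoint_distance(keypoints[a], keypoints[b])
             for a, b in distance_pairs]
    return angles + dists
```

Applying `frame_features` to every frame of a video yields the spatial feature sequence. In practice, distances would likely be normalized (for example by torso length) so that the features do not depend on how far the person is from the camera; that normalization is omitted here.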
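The HMM stage in item (2) can be illustrated with the forward algorithm. The sketch below assumes one discrete-emission HMM per action class, where each observation is the index of a classified atomic action, and assigns a video to the class whose HMM gives the highest likelihood. The abstract does not specify the emission model or the decision rule, so both are assumptions, as are the names `forward_likelihood` and `classify_action`.

```python
def forward_likelihood(obs, start, trans, emit):
    """P(obs | HMM) computed with the forward algorithm.

    obs:   list of observation indices (atomic action labels).
    start: start[i] is the probability of beginning in state i.
    trans: trans[i][j] is the probability of moving from state i to j.
    emit:  emit[i][o] is the probability of emitting symbol o in state i.
    """
    if not obs:
        raise ValueError("observation sequence must not be empty")
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    for o in obs[1:]:
        alpha = [
            sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][o]
            for j in range(n)
        ]
    return sum(alpha)


def classify_action(obs, models):
    """Return the name of the model with the highest likelihood for obs.

    models: dict mapping an action name to a (start, trans, emit) tuple.
    """
    return max(models, key=lambda name: forward_likelihood(obs, *models[name]))
```

For long sequences the raw probabilities underflow, so a practical version would rescale `alpha` at each step or work in log space.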
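For item (3), the two inputs of the two-stream model and the role of the body structure can be sketched in simplified form. The example assumes that the motion information is the frame-to-frame displacement of each keypoint, and it shows one parameter-free aggregation step over the skeleton graph (each joint averaged with its connected neighbours). A real SGCN or SemGCN layer additionally has learnable weights and learned edge importance, which are omitted; the function names and the edge list are hypothetical.

```python
def motion_stream(frames):
    """Frame-to-frame displacement of every keypoint.

    frames: list of frames, each a list of (x, y) keypoints.
    Returns len(frames) - 1 frames of (dx, dy) displacements.
    """
    return [
        [(x1 - x0, y1 - y0) for (x0, y0), (x1, y1) in zip(prev, cur)]
        for prev, cur in zip(frames, frames[1:])
    ]


def neighbour_average(keypoints, edges):
    """Average each joint with its neighbours in the skeleton graph.

    keypoints: list of (x, y) features, one per joint.
    edges: list of (i, j) index pairs for connected joints (bones).
    """
    n = len(keypoints)
    # Each joint's neighbourhood includes itself (a self-loop).
    neigh = [{i} for i in range(n)]
    for i, j in edges:
        neigh[i].add(j)
        neigh[j].add(i)
    out = []
    for i in range(n):
        xs = [keypoints[j][0] for j in neigh[i]]
        ys = [keypoints[j][1] for j in neigh[i]]
        out.append((sum(xs) / len(xs), sum(ys) / len(ys)))
    return out
```

The same aggregation can be applied to the raw skeleton frames and to the output of `motion_stream`, mirroring the two identical spatial branches described above before temporal modelling and fusion.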
Keywords/Search Tags: Semantic graph convolutional network, Hidden Markov model, Spatio-temporal feature, Human action recognition, Human pose estimation