Research On Human Behaviors Representation And Recognition Based On Multi-feature

Posted on: 2024-04-03
Degree: Master
Type: Thesis
Country: China
Candidate: Y T Chen
GTID: 2568307100989199
Subject: Electronic information
Abstract/Summary:
Human action recognition (HAR) has attracted the attention of researchers at home and abroad because of its wide practical applicability. With the rapid development of deep learning, deep-learning-based HAR has become one of the mainstream approaches. As video acquisition equipment has become widespread, HAR based on monocular cameras has made remarkable progress. However, because of interference from background, illumination, and other factors, its accuracy has hit a bottleneck. Although researchers have proposed a series of methods such as background subtraction and filtering, it remains difficult to balance efficiency and accuracy. In addition, because human actions are highly varied, difficult to extract, and costly to annotate, the number of training samples falls short of what deep learning requires. To overcome these problems, this thesis proposes a multi-feature, multi-scale human action recognition model based on depth video and carries out a series of studies. The main work and contributions are summarized as follows:

(1) Because monocular-camera HAR struggles to overcome the bottlenecks of background, illumination, and occlusion, we extract human skeleton motion information from depth video, which establishes a strong association between the monitoring data and the actor. A human action representation sequence is constructed from the spatial information of the human joint points. Based on prior knowledge of human physiological structure and kinematics, a 3D floating-point matrix that integrates multiple features is designed to represent human behavior.

(2) To address the overfitting caused by insufficient samples in deep-learning-based HAR, a data augmentation strategy based on human physiological structure and kinematics is proposed. First, because human skeletons share the same basic structure but differ in size and proportion, a skeleton scaling strategy is used to augment the data. Second, some consecutive time frames are set to blank frames to simulate action fragments, which expands the data and further improves the robustness of the model. Third, a similar frame is inserted into the action sequence to make the behavior data more consistent in time scale. Finally, the augmented data are combined to form the final action samples.

(3) To address the temporal multi-scale problem caused by the inconsistent duration of human actions, a multi-scale HAR method is proposed. First, multi-scale adjustment of human action samples is achieved by combining the advantages of the three proposed scale normalization strategies. Second, multi-scale input data require the neural network model to handle multiple scales as well. A multi-scale convolutional neural network is therefore built by introducing a spatial pyramid pooling (SPP) layer and a global average pooling (GAP) layer, combining the fine-grained features of the former with the advantages of the latter. Finally, the multi-scale transformation of the action samples and that of the network model are combined to achieve multi-scale human action recognition.
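
The three augmentation steps named in contribution (2) can be illustrated with a minimal Python sketch. It is not the thesis implementation. The abstract does not specify the data layout or the exact operations, so several things here are assumptions: a sequence is a list of frames, each frame is a list of (x, y, z) joints, scaling is done about a hypothetical root joint at index 0, a "blank" frame is taken to mean all-zero joints, and the "similar frame" is taken to be the midpoint of two neighbouring frames. All function names are hypothetical.

```python
from typing import List, Tuple

Joint = Tuple[float, float, float]
Frame = List[Joint]
Sequence = List[Frame]


def scale_skeleton(seq: Sequence, factor: float, root: int = 0) -> Sequence:
    """Scale each frame about its root joint (assumed index), keeping the root fixed."""
    out = []
    for frame in seq:
        rx, ry, rz = frame[root]
        out.append([
            (rx + factor * (x - rx), ry + factor * (y - ry), rz + factor * (z - rz))
            for x, y, z in frame
        ])
    return out


def blank_frames(seq: Sequence, start: int, length: int) -> Sequence:
    """Zero out up to `length` consecutive frames from `start` (assumed meaning of 'blank')."""
    out = [list(frame) for frame in seq]
    for t in range(start, min(start + length, len(seq))):
        out[t] = [(0.0, 0.0, 0.0)] * len(seq[t])
    return out


def insert_similar_frame(seq: Sequence, index: int) -> Sequence:
    """Insert the midpoint of frames `index` and `index + 1` between them (assumed interpolation)."""
    a, b = seq[index], seq[index + 1]
    mid = [
        ((xa + xb) / 2, (ya + yb) / 2, (za + zb) / 2)
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    ]
    return seq[:index + 1] + [mid] + seq[index + 1:]
```

Each function returns a new sequence and leaves its input unchanged, so the augmented variants can be generated independently and then pooled into one training set, in the spirit of the final combining step the abstract describes.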
Keywords/Search Tags: action recognition, depth video, data augmentation, spatial-temporal feature representation, multi-scale learning