Human Action Spatial-temporal Feature Mining And Recognition Method Based On Skeleton

Posted on: 2024-06-12
Degree: Master
Type: Thesis
Country: China
Candidate: Z H Jian
Full Text: PDF
GTID: 2568307118976899
Subject: Control Science and Engineering
Abstract/Summary:
As one of the research hotspots in artificial intelligence, human action recognition has shown great potential in fields such as intelligent buildings, intelligent healthcare, production safety, entertainment and leisure, autonomous driving, and the military. At present, there are two main types of human action recognition methods: recognition based on wearable sensors and recognition based on video stream data. Sensor-based recognition relies on wearable sensors, which are cumbersome to deploy and costly to study. Recognition based on video stream data identifies actions from RGB video captured by cameras, and is easy to implement, broadly applicable, and low in cost. Skeleton-based action recognition, which uses human joint point information extracted from video, has become one of the mainstream methods because of its low noise, strong robustness, and immunity to video background interference. Although skeleton-based action recognition has produced many excellent results, the mining of spatio-temporal features in human behavior needs more in-depth research. How to represent the spatio-temporal features of human behavior efficiently is the key question of this topic. Based on an in-depth analysis of related research on human action recognition in China and abroad, this thesis carries out the following work.

(1) An action recognition method based on spatio-temporal tensor fusion is proposed. Human behavior is a dynamic process with spatial complexity and temporal variation, so extracting its spatio-temporal features efficiently is particularly important. Starting from human skeleton data, this thesis first constructs a spatio-temporal feature tensor between adjacent frames to improve the representational ability of the data, and then uses cosine similarity to screen out the key spatio-temporal feature tensors of a behavior. Next, based on differences in feature distribution and on temporal convolution, an attention mechanism over the spatio-temporal feature tensors of an action is constructed to enhance the representation of its spatio-temporal features. Finally, a deep random configuration network is used to recognize actions.

(2) An adaptive behavior recognition algorithm based on multi-scale dynamic warping is proposed. This thesis analyzes in depth the shortcomings of the action recognition method based on spatio-temporal tensor fusion. First, to address the imbalance of spatio-temporal features between adjacent frames, these features are reconstructed based on kinematics and human body structure. Second, because a single similarity threshold cannot suit multiple behaviors at the same time, an adaptive cosine similarity screening method is proposed. Third, because multi-scale temporal convolution alters the distribution of the spatio-temporal features of a behavior, the dynamic time warping algorithm is integrated to extract features at multiple time scales without affecting the feature distribution.

(3) A human skeleton feature extraction software based on OpenPose is developed. Skeleton information is difficult to obtain when building custom behavior skeleton datasets. To address this, this thesis uses Qt Creator to develop the frontend interface of the software on top of OpenPose and uses Python to develop its backend functions. The software extracts, displays, and stores human skeleton coordinates and key angles from both camera video and local video files. Its usability is verified by testing.

In summary, this thesis addresses the research difficulties of skeleton-based human action recognition from two aspects: spatio-temporal feature extraction and skeleton information acquisition. First, an action recognition method based on spatio-temporal tensor fusion is proposed, which combines empirical features with deep learning features to extract spatio-temporal fusion features of human actions, and then uses a deep random configuration network with higher modeling efficiency in place of the fully connected layer to recognize actions. Second, an adaptive cosine similarity screening method is proposed and the dynamic time warping algorithm is introduced, yielding an adaptive behavior recognition algorithm based on multi-scale dynamic warping. Finally, a human skeleton feature extraction software based on OpenPose is developed to acquire human skeleton coordinates and key angles from video frames, so that researchers can obtain behavior skeleton information more conveniently. This thesis includes 32 figures, 9 tables, and 68 references.
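The cosine-similarity screening of adjacent-frame features mentioned in contribution (1) can be illustrated with a minimal sketch. The abstract does not give the thesis's actual tensor construction or threshold, so everything here is an assumption made for illustration: the adjacent-frame feature is taken to be the per-coordinate displacement between consecutive frames, a feature is kept only when its cosine similarity to the last kept feature falls below a fixed `threshold`, and the function names are hypothetical.

```python
import math
from typing import List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def adjacent_frame_features(frames: Sequence[Sequence[float]]) -> List[List[float]]:
    """Assumed feature: displacement of every joint coordinate between consecutive frames."""
    return [[cur - prev for prev, cur in zip(f0, f1)]
            for f0, f1 in zip(frames, frames[1:])]


def select_key_features(features: Sequence[Sequence[float]],
                        threshold: float = 0.95) -> List[int]:
    """Return indices of features that differ enough from the last kept feature.

    Assumed rule: a feature is redundant when its cosine similarity to the
    most recently kept feature is at or above `threshold`.
    """
    if not features:
        return []
    kept = [0]
    for i in range(1, len(features)):
        if cosine_similarity(features[kept[-1]], features[i]) < threshold:
            kept.append(i)
    return kept


# Toy sequence: one joint with (x, y) coordinates moving right, then up.
frames = [[0, 0], [1, 0], [2, 0], [2, 1], [2, 2]]
features = adjacent_frame_features(frames)
print(select_key_features(features))  # prints [0, 2]
```

In this toy sequence the two rightward displacements are identical, as are the two upward ones, so only the first feature of each motion direction is kept. An adaptive variant in the spirit of contribution (2) would replace the fixed `threshold` with a value chosen per behavior, which the abstract does not specify.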
Keywords/Search Tags: activity recognition, human skeleton, keyframe, attention mechanism, spatial-temporal feature