| With the advancement of science and technology, society is moving toward an era of information and intelligence. Human gesture recognition has been an active research topic in computer vision in recent years. It plays an important role in high-tech fields such as wearable computing, intelligent monitoring, and animation modeling, and has broad application prospects. Current research can be divided into two directions: one based on RGB video, and the other based on 3D human skeleton sequences. This paper first studies human gesture recognition based on RGB video, which consists of two main parts: tracking the human bounding box in each frame of the video, and recognizing the posture of the person inside that bounding box. The key lies in accurate target tracking and feature extraction. Therefore, a multi-detector fusion multi-target tracking algorithm is proposed to address the inaccurate posture recognition caused by human tracking errors. Unlike recognition based on RGB video, skeleton-based human posture recognition represents the change of body posture over a period of time by the body skeleton in each frame. How to accurately extract the temporal and spatial features of the skeleton sequence at the same time therefore becomes the key to describing posture changes accurately. Most existing studies start from the human skeleton itself. These methods are limited to the structure of the skeleton and ignore the influence that the relationships between joint points have on movement; moreover, modeling the original skeleton sequence directly is affected by changes in the distribution of joint points caused by changes in viewing angle. Therefore, this paper proposes a new method for extracting human skeleton features based on complex networks. The relationship between the joint points in the
human skeleton of each frame is encoded as a network, and the change in body posture over a period of time is described by a time series of networks formed from the joint points. Following the idea of complex networks, the skeleton information is represented by network features, and methods such as LSTM and CNN are combined to develop the CNC-LSTM and CN-CNN+LSTM algorithms, which are suited to skeleton-based human posture recognition. The specific contributions are as follows: 1. A multi-detector fusion, deep correlation filtering algorithm for video multi-target tracking is proposed and applied to human gesture recognition based on RGB video, mitigating the effect of selecting the wrong human region during recognition. The algorithm introduces a new fusion mechanism that uses information from multiple detectors to reduce the missed and false detections of a single detector, overcoming the performance limits of any single detector and making the acquisition of new targets more reliable. In addition, the deep correlation filtering algorithm ECO is used to track the targets one by one, and a series of improvements to the original ECO algorithm are proposed to make it better suited to video multi-target tracking. 2. Complex networks are applied to feature extraction from the human skeleton, and a skeleton-based human posture network model is constructed that describes the evolution of human posture over time. For different human postures, the evolution of the network should also differ, which provides the theoretical basis for using complex networks to distinguish postures. The human posture network model describes the interaction between the joint points of the body as a posture unfolds. Therefore, for the same human body posture, different subjects and different
perspectives have little influence on the described posture, which addresses the variation in skeleton size and the uneven distribution of joint points caused by changes in viewing angle. 3. Taking into account the particular structure of the human skeleton, the original network is improved. The skeleton is divided into five parts: the trunk, left arm, right arm, left leg, and right leg. Because, from a kinematic point of view, different parts are not equally important to the body posture, different weight matrices are applied to different parts of the body when constructing the relationship model between joint points. This makes the influence of each joint on the posture more reasonable and reduces the excessive influence that joint points within the same part would otherwise have on one another. In addition, given the strong performance of convolutional neural networks in image recognition, after the skeleton sequence is transformed into human pose network graphs, a convolutional neural network is used to extract the spatial features between joint points from the pose network graph of each frame. These spatial features are then fed to the LSTM, which extracts the temporal features. Fusing the two kinds of features for posture recognition yields higher accuracy. 4. A human pose recognition model based on complex network coding and a long short-term memory (LSTM) neural network is designed. In this model, complex network coding replaces the CNN in representing the human skeleton of each frame, and the topological properties of the human pose network are calculated. To describe the entire time series of networks, the topological attributes are divided into the attributes of each node, including closeness centrality, betweenness centrality, weighted degree, and eigenvector centrality, and the
topological attributes of the entire network, including average closeness centrality, average degree, network diameter, and average path length; these are combined into feature vectors that describe the skeleton sequence. The temporal features of the human skeleton are then extracted by the LSTM, and the temporal and spatial features are finally combined to represent a continuous human posture. The method has been tested on the public gesture recognition datasets NTU RGB+D, UTKinect-Action, and MSR Action 3D. The results show that the proposed method has advantages in recognition accuracy and running time. This paper first studies human gesture recognition based on RGB video, improving human tracking and laying the foundation for subsequent gesture recognition, and then focuses on gesture recognition based on the 3D human skeleton. To overcome the limitation of existing research to the skeleton structure itself, complex networks are applied to human gesture recognition. The proposed model is improved step by step in the course of the research: first, features of the complex network model are extracted directly by a CNN, and then the skeleton is represented by complex network coding instead of the CNN. |
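
The complex network coding step described above can be illustrated with a minimal sketch. It is not the implementation used in this work: the abstract does not state how edges between joint points are weighted, so the sketch assumes a fully connected network whose edge weights are the Euclidean distances between joints, and it computes only a subset of the listed attributes (weighted degree and closeness centrality per node; average weighted degree, average closeness centrality, network diameter, and average path length for the whole network). The function names `build_joint_network`, `shortest_paths`, `frame_features`, and `sequence_features` are hypothetical, and the frame is assumed to contain at least two distinct joint positions.

```python
import math
from itertools import combinations


def build_joint_network(joints):
    """Weight matrix of a complete graph over the joints.

    Assumption (not from the source): w[i][j] is the Euclidean
    distance between joint i and joint j.
    """
    n = len(joints)
    w = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        w[i][j] = w[j][i] = math.dist(joints[i], joints[j])
    return w


def shortest_paths(w):
    """All-pairs shortest path lengths (Floyd-Warshall).

    On a complete Euclidean graph these equal the direct edge weights;
    the general algorithm is kept so the sketch also applies to a
    connected graph whose missing edges are set to math.inf.
    """
    n = len(w)
    dist = [row[:] for row in w]
    for k in range(n):
        for i in range(n):
            for j in range(n):
                via_k = dist[i][k] + dist[k][j]
                if via_k < dist[i][j]:
                    dist[i][j] = via_k
    return dist


def frame_features(joints):
    """Feature vector for one frame: node attributes, then network attributes."""
    n = len(joints)
    w = build_joint_network(joints)
    dist = shortest_paths(w)

    # Per-node attributes.
    weighted_degree = [sum(row) for row in w]
    closeness = [(n - 1) / sum(row) for row in dist]

    # Whole-network attributes.
    pair_lengths = [dist[i][j] for i, j in combinations(range(n), 2)]
    network_part = [
        sum(weighted_degree) / n,               # average weighted degree
        sum(closeness) / n,                     # average closeness centrality
        max(pair_lengths),                      # network diameter
        sum(pair_lengths) / len(pair_lengths),  # average path length
    ]
    return weighted_degree + closeness + network_part


def sequence_features(frames):
    """One feature vector per frame, in temporal order."""
    return [frame_features(joints) for joints in frames]


# Toy sequence: two frames of a three-joint "skeleton".
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0)],
]
features = sequence_features(frames)
print(len(features), len(features[0]))
```

The per-frame vectors returned by `sequence_features` are in the form a sequence model such as an LSTM expects as input. Under the distance-based weighting assumed here, the features do not change when the skeleton is rigidly rotated or translated, which is one way the insensitivity to viewing angle mentioned above could arise.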