| As an important topic in computer vision, human behavior recognition has attracted sustained attention from researchers and has been widely applied in intelligent surveillance, human-machine interaction, video retrieval, motion analysis, and other areas. Human behavior recognition faces many difficulties, such as occlusion, intra-class variation, inter-class similarity, scale change, illumination change, and background interference, so it is difficult to perform based solely on RGB video streams. With the popularity of Kinect and similar devices, depth data and skeletal joint data have become increasingly easy to obtain. Because skeletal joint data can model the human body well, more and more researchers have begun to study human behavior recognition based on skeletal joints. This paper studies behavior recognition based on skeletal joint data. The human behavior recognition studied here comprises three parts: action recognition, real-time action detection, and real-time action segmentation. Action recognition identifies the action category of a segmented video sequence; real-time action detection detects in real time whether an action occurs in an unsegmented video sequence; real-time action segmentation determines in real time the category of an action together with its start and end frames in an unsegmented sequence. In the past, most researchers focused on action recognition and paid less attention to real-time action detection and real-time action segmentation. With the popularity of Kinect and similar devices, these two tasks have also begun to attract attention in recent years. The work of this paper is as follows. In action recognition, the Moving Pose Descriptor, based on Taylor's mean value theorem and the body joint coordinates, can represent human actions
well. Similarly, based on Taylor's mean value theorem and the body joint angles, this paper proposes the Moving Angle Descriptor, which is a concatenation of joint angle, angular velocity, and angular acceleration. The Moving Pose Descriptor focuses on changes in the coordinates of the body's joints, whereas the Moving Angle Descriptor focuses on changes in the joint angles; each has its own advantages. To combine these advantages, the two descriptors are fused by weighting at the descriptor level, and the result is called the Fused Moving Descriptor. A bag-of-words model is built on this descriptor to perform action recognition. In real-time action detection and real-time action segmentation, the ELS algorithm is very effective, but it ignores the case in which the maximum subsequence sums of several action categories each equal or exceed their own thresholds. In this case, the ELS algorithm cannot determine which action has occurred. To address this shortcoming, this paper proposes two methods. The first, Record Sequential, records the first action class whose maximum subsequence sum equals or exceeds its threshold and uses that class as the output decision. The second, Calculate Threshold Ratio, computes the proportion by which each maximum subsequence sum exceeds its threshold and selects the action category with the largest proportion as the output decision. Experiments on the MSR-Action3D and MSRC-12 datasets demonstrate that the proposed Moving Angle Descriptor effectively represents human actions, and that the Fused Moving Descriptor effectively integrates the advantages of the Moving Pose Descriptor and the Moving Angle Descriptor, achieving higher classification accuracy than either descriptor used alone. In the
experiments on real-time action detection and real-time action segmentation, the performance of the ELS algorithm and the two improved methods is compared. The results show that Calculate Threshold Ratio better handles the case in which the maximum subsequence sums of several action categories each equal or exceed their own thresholds. This method is called the iELS (improved Efficient Linear Search) algorithm in this paper. |
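To make the structure of the Moving Angle Descriptor concrete, the following is a minimal sketch of one way to assemble angle, angular velocity, and angular acceleration for a single frame. The abstract does not give the formulas, so several things here are assumptions: the central-difference approximations (borrowed from the way the Moving Pose Descriptor is usually written), the weights `alpha` and `beta`, and the function names `joint_angle` and `moving_angle_descriptor`, which are hypothetical.

```python
import math
from typing import List, Sequence


def joint_angle(a: Sequence[float], b: Sequence[float], c: Sequence[float]) -> float:
    """Angle in radians at joint b, between the segments b->a and b->c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # degenerate segment; a convention chosen for this sketch
    cos_theta = sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))


def moving_angle_descriptor(
    angles: Sequence[Sequence[float]],
    t: int,
    alpha: float = 1.0,  # assumed weight on angular velocity
    beta: float = 1.0,   # assumed weight on angular acceleration
) -> List[float]:
    """Concatenate angle, angular velocity, and angular acceleration at frame t.

    angles[f][k] is the k-th joint angle at frame f. The derivatives are
    approximated with central differences over a five-frame window
    (an assumption, not taken from the abstract).
    """
    if not 2 <= t < len(angles) - 2:
        raise IndexError("frame t needs two neighbours on each side")
    current = list(angles[t])
    n = len(current)
    velocity = [angles[t + 1][k] - angles[t - 1][k] for k in range(n)]
    acceleration = [
        angles[t + 2][k] + angles[t - 2][k] - 2.0 * current[k] for k in range(n)
    ]
    return current + [alpha * v for v in velocity] + [beta * a for a in acceleration]
```

For a skeleton with K joint angles, this sketch yields a vector of length 3K per frame; frames within two steps of either end of the sequence are skipped because the five-frame window does not fit.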
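The two rules proposed for resolving the case where several action categories reach their thresholds can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the per-class maximum subsequence sums are taken as given inputs, thresholds are assumed to be positive, the ratio is taken as sum divided by threshold, and the function names and dictionary layout are hypothetical.

```python
from typing import Dict, Optional, Sequence


def calculate_threshold_ratio(
    max_sums: Dict[str, float], thresholds: Dict[str, float]
) -> Optional[str]:
    """Among classes whose sum reaches its own threshold, return the one
    with the largest sum-to-threshold ratio, or None if no class qualifies."""
    best_label: Optional[str] = None
    best_ratio = 0.0
    for label, total in max_sums.items():
        threshold = thresholds[label]
        if threshold <= 0:
            raise ValueError("this sketch assumes positive thresholds")
        if total >= threshold:
            ratio = total / threshold  # assumed definition of the proportion
            if best_label is None or ratio > best_ratio:
                best_label, best_ratio = label, ratio
    return best_label


def record_sequential(
    sums_per_frame: Sequence[Dict[str, float]], thresholds: Dict[str, float]
) -> Optional[str]:
    """Return the first class, in frame order, whose sum reaches its threshold.

    If several classes cross in the same frame, dictionary order breaks the
    tie (an arbitrary choice made for this sketch).
    """
    for frame_sums in sums_per_frame:
        for label, total in frame_sums.items():
            if total >= thresholds[label]:
                return label
    return None
```

With hypothetical thresholds `{"wave": 2.0, "kick": 5.0}` and sums `{"wave": 3.0, "kick": 6.0}`, both classes qualify; the ratio rule picks `"wave"` because 3.0 / 2.0 exceeds 6.0 / 5.0, whereas the sequential rule depends only on which class crossed its threshold first.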