Research on Human Behavior Detection Methods Based on Spatiotemporal Collaboration in Natural Scenes

Posted on: 2024-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: M Y Tao
GTID: 2568307127463624
Subject: Information and Communication Engineering
Abstract/Summary:
With the development and application of intelligent monitoring and human-computer interaction technology, human behavior detection has become a hot research topic in computer vision and has received widespread attention from scholars. Unlike behavior classification on edited video, human behavior detection must automatically locate behavior clips and recognize the behavior in real scenes. However, because of complex backgrounds, unequal behavior durations, high similarity among behaviors, and diverse categories, detection accuracy still needs to improve. To raise performance, end-to-end deep learning networks have gradually been applied to human behavior detection, represented by SSN, LGN, and R-C3D. But because of environment, lighting, and occlusion, the behavior features these networks extract are highly redundant. Moreover, because behaviors differ in duration, these networks are weak at capturing long-distance dependencies and spatiotemporal contextual information. These factors limit the accuracy of boundary localization and behavior classification. To further improve human behavior detection, this thesis carries out the following work:

(1) To address the high feature redundancy and the insufficient capture of long-distance dependencies among behaviors in the R-C3D behavior detection network, the thesis proposes an improved behavior detection network, RS-STCBD, which combines a residual shrinkage structure with spatiotemporal context. The network consists of a feature subnet with an embedded adaptive residual shrinkage mechanism, a proposal subnet based on a multilayer convolution strategy, and a classification subnet with a Non-Local attention mechanism. In the feature subnet, a shrinkage structure and soft-thresholding operations are embedded into 3D-ResNet to construct a residual shrinkage unit with channel-adaptive thresholds (3D-RSST), and multiple 3D-RSST units are cascaded to make behavior feature extraction more effective. In the proposal subnet, multilayer convolution replaces the single convolution to enlarge the temporal receptive field of the temporal candidate segments. In the classification subnet, a Non-Local attention mechanism captures long-distance dependencies among high-quality behavioral temporal segments. Experimental results on the THUMOS’14 and ActivityNet1.2 datasets show that the mAP@0.5 of RS-STCBD reaches 36.9% and 41.6%, respectively, which is 8.0 and 14.8 percentage points higher than the R-C3D method.

(2) To address the inability of the R-C3D behavior detection network to capture multi-level, multi-granularity spatiotemporal context information, the thesis further proposes a behavior detection network with a spatiotemporal symmetrical multiscale structure, RS-STSM. The network embeds a spatiotemporal symmetrical multiscale structure in the proposal subnet to obtain spatiotemporal symmetrical multiscale motion features at different levels and granularities by expanding the spatiotemporal receptive field of the behavioral feature map. Meanwhile, a Soft-NMS strategy replaces standard non-maximum suppression in the classification subnet to filter high-quality temporal segments. Experiments were also performed on the THUMOS’14 and ActivityNet1.2 datasets. The results indicate that the mAP@0.5 of the RS-STSM network reaches 39.4% and 42.2% on the two datasets, which is 10.5 and 15.4 percentage points higher than R-C3D, respectively.

Compared with related methods, the RS-STCBD and RS-STSM networks achieve better accuracy in action boundary localization and behavior classification. The improved behavior detection networks can therefore help raise the quality of human-computer interaction in natural scenes.
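
As a minimal sketch of two operations the abstract names, the Python below shows soft thresholding (the shrinkage step inside a residual shrinkage unit) and Soft-NMS applied to scored temporal segments. It is illustrative only and is not the thesis's implementation: the function names are hypothetical, the threshold `tau` is passed in as a fixed number whereas 3D-RSST is described as using channel-adaptive thresholds, and the Gaussian score decay with `sigma` is an assumed Soft-NMS variant, since the abstract does not say which decay function is used.

```python
import math


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; any value with |x| <= tau becomes 0.

    Assumption: tau is a fixed, non-negative number supplied by the caller.
    """
    if x > tau:
        return x - tau
    if x < -tau:
        return x + tau
    return 0.0


def temporal_iou(a, b):
    """Intersection over union of two (start, end) temporal segments."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0


def soft_nms_1d(segments, scores, sigma=0.5, score_floor=0.001):
    """Gaussian Soft-NMS over temporal segments (assumed variant).

    Instead of deleting segments that overlap the current best one,
    their scores are decayed by exp(-IoU^2 / sigma). Segments whose
    score drops below score_floor are discarded.
    Returns a list of (segment, score) in the order they were selected.
    """
    remaining = [(score, seg) for seg, score in zip(segments, scores)]
    kept = []
    while remaining:
        # Select the highest-scoring segment that is still in play.
        best_idx = max(range(len(remaining)), key=lambda i: remaining[i][0])
        best_score, best_seg = remaining.pop(best_idx)
        kept.append((best_seg, best_score))

        # Decay the scores of the others according to their overlap.
        decayed = []
        for score, seg in remaining:
            overlap = temporal_iou(best_seg, seg)
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_floor:
                decayed.append((score, seg))
        remaining = decayed
    return kept


if __name__ == "__main__":
    # Soft thresholding on a few hypothetical feature values.
    print([soft_threshold(v, 0.2) for v in (0.5, -0.1, -0.6)])

    # Two heavily overlapping proposals and one separate proposal.
    proposals = [(0.0, 10.0), (1.0, 11.0), (20.0, 30.0)]
    confidences = [0.9, 0.8, 0.7]
    for seg, score in soft_nms_1d(proposals, confidences):
        print(seg, round(score, 3))
```

With the segments (0, 10), (1, 11), and (20, 30) scored 0.9, 0.8, and 0.7, the heavily overlapping second segment is kept with a reduced score and ranked last rather than being discarded, which is what distinguishes Soft-NMS from hard non-maximum suppression.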
Keywords/Search Tags: Human behavior detection, Residual adaptive shrinkage mechanism, Spatiotemporal context, Non-local attention mechanism, Spatiotemporal symmetrical multiscale structure