
Research On Human Motion Detection System Based On Spatio-Temporal Feature Fusion

Posted on: 2023-02-21    Degree: Master    Type: Thesis
Country: China    Candidate: W L Liu    Full Text: PDF
GTID: 2558306905468064    Subject: Electronic and communication engineering
Abstract/Summary:
With the rapid development of deep learning and computer hardware, human motion detection has become a research hotspot in computer vision, and it is expected to be gradually applied to security monitoring, intelligent medical care, human-computer interaction, and other fields. The difficulty of human action recognition lies in understanding and modeling the relationship between temporal and spatial context. This thesis proposes a human action localization and recognition algorithm based on spatio-temporal feature fusion and completes the design of a human action detection system. The main work is as follows.

First, drawing on the ideas of two-stream action recognition networks and object detection algorithms, a new action recognition network architecture is proposed. A feature extraction network tailored to the action recognition task is designed, with a 2D CNN and a 3D CNN connected in parallel, and the extracted features are fused along the feature depth dimension by a conventional channel and spatial attention module. A YOLO V3 head is then used to localize the action performer in space and to determine the action category. This completes the architecture of the action detection algorithm and provides the baseline for the subsequent work in this thesis.

Second, because the conventional channel and spatial attention module cannot achieve interaction between features of different scales, three feature interaction modules are designed based on the Non_Local algorithm: a self-feature interaction module, a top-down feature interaction module, and a bottom-up feature interaction module. A feature fusion network based on a feature pyramid is built from these three modules. Applied to spatio-temporal features at different scales, it realizes interaction across space, time, and scale, which strengthens the association between targets at different scales and improves network performance.

Finally, an embedded hardware platform is built and used as the front-end image acquisition device. The device uses the Rockchip RK3399 as its main control chip, and the hardware and interfaces are selected according to the requirements of the system. To improve the running speed of the model, a lightweight feature extraction network is designed based on depthwise (channel-by-channel) convolution, pointwise (point-by-point) convolution, and Ghost convolution, and the human action detection algorithm is deployed on the server side. Using Socket communication, the RK3399-based front-end image acquisition device communicates with the server, completing the design of the overall motion detection system.
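
The abstract does not describe the wire format used between the RK3399 front end and the server. As a rough illustration only, one common way to move image frames over a stream socket is to prefix each payload with its length. The sketch below assumes a 4-byte big-endian length header; the helper names (send_frame, recv_frame, demo) and the reply string are hypothetical and not taken from the thesis. A local socket pair stands in for the device-to-server link.

```python
import socket
import struct

# Assumed framing: 4-byte big-endian payload length, then the payload.
HEADER = struct.Struct("!I")


def send_frame(sock, payload: bytes) -> None:
    """Send one length-prefixed message."""
    sock.sendall(HEADER.pack(len(payload)) + payload)


def recv_exact(sock, n: int) -> bytes:
    """Read exactly n bytes; recv() may return fewer than requested."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            raise ConnectionError("socket closed before the full frame arrived")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def recv_frame(sock) -> bytes:
    """Receive one length-prefixed message."""
    (length,) = HEADER.unpack(recv_exact(sock, HEADER.size))
    return recv_exact(sock, length)


def demo():
    """Round trip over a local socket pair standing in for device and server."""
    device, server = socket.socketpair()
    try:
        fake_image = bytes(range(256)) * 4  # stand-in for an encoded camera frame
        send_frame(device, fake_image)      # front end uploads a frame
        received = recv_frame(server)       # server reads the whole frame
        send_frame(server, b"label=unknown")  # placeholder detection reply
        reply = recv_frame(device)
        return received, reply
    finally:
        device.close()
        server.close()
```

In a real deployment the server side would listen on a TCP port and hand each received frame to the detection model; the length prefix lets the receiver know where one image ends and the next begins.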
Keywords/Search Tags: Human motion detection, Spatiotemporal feature pyramid, Spatiotemporal feature fusion, Attention mechanism, RK3399