Research On Spatio-temporal Information Fusion Human Behavior Recognition Methods

Posted on: 2022-09-07
Degree: Master
Type: Thesis
Country: China
Candidate: Y Q Shi
GTID: 2518306554970909
Subject: Computer Science and Technology
Abstract/Summary:
Automatic recognition of human actions in video means identifying the human actions present in raw video data. It is a key technology for video semantic understanding and structured video description. Compared with recognizing vehicles in video, recognizing human behavior demands higher accuracy and is harder because of the wide variety of behaviors. Captured video is also affected by objective factors such as camera jitter, complex backgrounds, and viewpoint changes. As a result, existing video behavior recognition algorithms remain limited in how well they handle these issues. Focusing on spatio-temporal information fusion, we analyze the important characteristics of key behavior information in video and study them in depth using deep learning network models and related machine learning algorithms, with the goal of building models that detect and recognize human behavior in surveillance video. The main research contents are as follows:

(1) Because video data contains complex and redundant information in both the temporal and spatial dimensions, we propose a video human behavior recognition method based on dynamic spatio-temporal information fusion. The method selects the important temporal and spatial information between adjacent video frames. Its module computes the temporal and spatial differences between pixels, uses these differences to correct the temporal and spatial displacements between adjacent frames, and captures contextual information at adjacent moments. Within the current video frame, the temporal and spatial probability distribution of pixels is modeled. The method improves performance on video recognition tasks, and experiments on public data sets demonstrate its effectiveness and efficiency: the average accuracy of the model improves by more than 1 to 2% on the respective data sets.

(2) To address the low accuracy of Res2Net-based human behavior recognition against video backgrounds, we propose an improved video behavior recognition network based on attention modules. The model adds an improved spatial attention module and an improved channel attention module in cascade. The spatial attention module directly extracts key pixel information and channel information from the output feature map, which improves the accuracy of spatial associations between pixels and further aggregates information across pixels. An improved channel attention module is then added to suppress static background scene information, and the behavior recognition model that combines the two modules is studied. Experiments on a public data set show that, without increasing FLOPs or the number of parameters, the model outperforms other network models, with an overall accuracy improvement of about 2%. The model can effectively improve the accuracy of human behavior recognition in video.
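
The cascaded arrangement described in (2), spatial attention followed by channel attention, can be illustrated with a small sketch. The abstract does not specify how the thesis's improved modules are built, so everything below is an assumption made for illustration: the function names are hypothetical, the gates are parameter-free sigmoids of average activations rather than learned layers, and the feature map is a plain nested list of shape channels x height x width. It shows only the order of operations, not the thesis's actual modules.

```python
import math


def sigmoid(x):
    """Logistic function used as the attention gate."""
    return 1.0 / (1.0 + math.exp(-x))


def spatial_attention(fmap):
    """Reweight each pixel by a sigmoid of its channel-averaged activation.

    Assumption: a parameter-free gate stands in for the learned spatial
    attention module, whose design the abstract does not give.
    """
    channels = len(fmap)
    height, width = len(fmap[0]), len(fmap[0][0])
    mask = [
        [sigmoid(sum(fmap[c][i][j] for c in range(channels)) / channels)
         for j in range(width)]
        for i in range(height)
    ]
    return [
        [[fmap[c][i][j] * mask[i][j] for j in range(width)]
         for i in range(height)]
        for c in range(channels)
    ]


def channel_attention(fmap):
    """Reweight each channel by a sigmoid of its global average activation.

    Assumption: global average pooling plus a sigmoid stands in for the
    learned channel attention module.
    """
    out = []
    for plane in fmap:
        pixels = [v for row in plane for v in row]
        gate = sigmoid(sum(pixels) / len(pixels))
        out.append([[v * gate for v in row] for row in plane])
    return out


def cascaded_attention(fmap):
    """Apply spatial attention first, then channel attention."""
    return channel_attention(spatial_attention(fmap))
```

Calling `cascaded_attention` on a feature map returns a map of the same shape in which weakly activated pixels and channels are scaled down relative to strongly activated ones. A real network would replace both gates with learned layers trained end to end.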
Keywords/Search Tags: Behavior recognition, deep learning, spatio-temporal features, feature fusion, attention module