Research On Video Human Behavior Recognition By Fusing Time Series And Spatial Features

Posted on:2023-12-30

Degree:Master

Type:Thesis

Country:China

Candidate:S P Wu

Full Text:PDF

GTID:2568306785464174

Subject:Information and Communication Engineering

Abstract/Summary:

In recent years, with the development of computer hardware and software, the volume of video data on the Internet has grown exponentially, and video-based human behavior recognition has become a key component of effective video data management and analysis. Using deep learning theory, this thesis addresses the problem of accurately recognizing human behavior in videos from two perspectives: time series and spatial features. The main research contents are as follows:

(1) Modeling the one-dimensional feature vectors output by the fully connected layer directly as a time series gives poor recognition results. To address this, this thesis uses a convolutional long short-term memory network (ConvLSTM) to model the feature maps output by the convolutional layer as a time series while taking spatial information into account. To capture actions more accurately, a long short-term memory network (LSTM) is used to further describe the video from the features output by the ConvLSTM. Attention mechanisms are also incorporated into the feature extraction network to extract the features that are useful for behavior recognition, and the optimal point at which to incorporate them is explored.

(2) Using only the LSTM output at the last time step to represent the whole video leads to low recognition accuracy. To solve this, this thesis designs an aggregation network that adaptively aggregates the LSTM outputs at all time steps: first, the input features are scanned to obtain the weight coefficients; second, the input features are integrated into the aggregation vector according to the weight coefficients and the current aggregated feature vector, until the scan is complete and the final video description is obtained. The improved human behavior recognition model achieves an accuracy of 91.26% on the UCF101 dataset, which is 5 percentage points higher than the approach that uses only the last-time-step output of the LSTM to represent the whole video.

(3) To make full use of both spatial and temporal information, this thesis uses weighted fusion to combine the recognition results of feature-map modeling (ConvLSTM and LSTM applied to the output of the convolutional layer) with the recognition results of one-dimensional global feature modeling (LSTM and the adaptive network applied to the output of the fully connected layer), and the fused features are fed into the classifier to obtain the final recognition results. Fusing spatial and temporal information further improves recognition, and the accuracy reaches 95.68%.
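The aggregation in (2) and the weighted fusion in (3) can be illustrated with a small sketch. The abstract does not specify how the weight coefficients are computed or how the two branches are weighted, so everything below is an assumption made for illustration: the softmax scoring against a `query` vector, the fusion weight `alpha`, and all function and variable names are hypothetical and are not taken from the thesis.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate(step_features, query):
    """Pool per-time-step feature vectors into one video descriptor.

    Assumption: each step is scored by a dot product with a (normally
    learned) query vector, and the scores are normalized with softmax.
    """
    scores = [sum(q * x for q, x in zip(query, f)) for f in step_features]
    weights = softmax(scores)
    dim = len(step_features[0])
    pooled = [
        sum(w * f[i] for w, f in zip(weights, step_features))
        for i in range(dim)
    ]
    return pooled, weights


def fuse(feature_map_probs, global_probs, alpha=0.5):
    """Weighted fusion of class probabilities from the two branches.

    Assumption: a single scalar alpha balances the branches.
    """
    return [
        alpha * a + (1.0 - alpha) * b
        for a, b in zip(feature_map_probs, global_probs)
    ]


# Toy data: three time steps of 2-dimensional LSTM outputs.
lstm_outputs = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
query = [0.0, 0.0]  # a zero query gives uniform weights over time steps
pooled, weights = aggregate(lstm_outputs, query)

# Toy class probabilities from the two branches (two classes).
fused = fuse([0.7, 0.3], [0.4, 0.6], alpha=0.5)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

In contrast to taking only `lstm_outputs[-1]` as the video descriptor, `aggregate` lets every time step contribute in proportion to its weight, which is the motivation the abstract gives for the aggregation network.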

Keywords/Search Tags:

Human behavior recognition, Attention mechanism, Convolutional long short-term memory neural network, Adaptive network, Feature fusion

Related items

1	Design And Implementation Of Human Behavior Recognition System Based On Pose Estimation And Graph Convolutional Neural Network Construction
2	Speaker Emotional State Recognition Based On Speech And Text Fusion
3	Research On Human Action Recognition Method Integrating Visual Attention Mechanism And Deep Learning
4	Research On Human Behavior Recognition Based On Improved CNN-LSTM
5	Double Interactive Behavior Recognition Based On RGB And Depth Information Fusion
6	Research On Recognition Algorithm Of Human Abnormal Behavior Based On Video
7	Chinese Sign Language Recognition Based On Convolutional Network And Long Short Term Memory Network
8	Research On Human Behavior Recognition Method Based On Graph Convolutional Networks
9	Human Action Recognition Based On Multi-Feature Fusion
10	Research On Chinese Text Classification Method Based On Attention Mechanism And Multi-feature Fusion