Research On Video Action Recognition Based On Deep Learning

Posted on:2020-08-19

Degree:Master

Type:Thesis

Country:China

Candidate:M An

Full Text:PDF

GTID:2428330578966555

Subject:Information and Communication Engineering

Abstract/Summary:

Video action recognition uses computers to analyze and identify human actions in video sequences and is a research hotspot in computer vision. It has important research significance and broad application prospects in intelligent surveillance, smart homes, abnormal human behavior detection, and human-computer interaction. Existing action recognition models use only the appearance information and short-term motion information of a video and lack the ability to learn long-range temporal dependencies. This thesis therefore studies and improves action recognition methods to make them better suited to real-world use.

To recognize human actions in video, this thesis uses two independent convolutional neural networks to extract the spatial and temporal information of the video sequence, respectively, and combines them with a long short-term memory (LSTM) network to form Long-term Recurrent Convolutional Networks (LRCN). The LSTM units model the dependencies across the video sequence, so that the LRCN can process video sequences with long temporal structure. Experimental results show that the LRCN action recognition model has good robustness and generalization ability. The application of the LRCN model in power systems is also discussed.

When processing long video sequences, the LRCN model samples frames densely. This tends to produce a large amount of redundant information, increases the computational cost of the network, and does not capture the long-range temporal structure of complex actions well. This thesis therefore adopts Temporal Segment Networks (TSN), which sample sparsely over the whole video, as the base model in place of LRCN for long-range temporal modeling. TSN is combined with temporal pyramid pooling (TPP) to form the TSN-TPP network for human action recognition. TSN-TPP aggregates frame-level features at multiple temporal scales into a fixed-length video-level feature, which strengthens the weak temporal structure captured from the video. Experimental results show that this method effectively improves the accuracy of action recognition.

Finally, the LRCN and TSN-TPP action recognition models are both ported to Jetson TK1, a GPU-accelerated embedded platform, so that human action recognition runs on front-end devices and the server side is relieved of processing large volumes of video data.
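
The sparse sampling and temporal pyramid pooling mentioned in the abstract can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function names `sample_segment_indices` and `temporal_pyramid_pool`, the pyramid levels `(1, 2, 4)`, and the use of max pooling are all assumptions made for illustration, since the abstract does not state them. The sketch shows only how one frame per segment can be drawn from a whole video, and how per-frame feature vectors can be pooled at several temporal scales into a vector whose length does not depend on the number of frames.

```python
import random
from typing import List, Sequence


def sample_segment_indices(num_frames: int, num_segments: int,
                           rng: random.Random) -> List[int]:
    """Sparse sampling: one random frame index from each equal-length segment."""
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    indices = []
    for k in range(num_segments):
        start = (k * num_frames) // num_segments
        end = ((k + 1) * num_frames) // num_segments  # exclusive bound
        indices.append(rng.randrange(start, end))
    return indices


def temporal_pyramid_pool(frame_features: Sequence[Sequence[float]],
                          levels: Sequence[int] = (1, 2, 4)) -> List[float]:
    """Pool frame-level features over several temporal scales.

    For each level (assumed here: 1, 2 and 4 bins) the frame sequence is split
    into that many temporal bins, each bin is max-pooled per feature dimension
    (max pooling is an assumption), and all pooled vectors are concatenated.
    The output length is dim * sum(levels), whatever the number of frames.
    """
    t = len(frame_features)
    if t == 0:
        raise ValueError("need at least one frame")
    dim = len(frame_features[0])
    pooled: List[float] = []
    for bins in levels:
        for b in range(bins):
            start = (b * t) // bins
            # Guarantee a non-empty bin even when there are fewer frames than bins.
            end = max(((b + 1) * t) // bins, start + 1)
            window = frame_features[start:end]
            pooled.extend(max(f[d] for f in window) for d in range(dim))
    return pooled
```

With these assumed levels, a video represented by 5 frames and one represented by 9 frames both yield a video-level vector of 7 times the frame feature dimension, which is what allows a fixed-size classifier to follow the pooling step.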

Keywords/Search Tags:

action recognition, long-term recurrent convolutional networks, temporal segment networks, temporal pyramid pooling, embedded GPU

Related items

1	Research On Action Recognitions Based On Spatio-temporal Context Modeling
2	Research On Language Identification Based On Temporal Feature Representation
3	Research On Video Action Recognition Algorithm Based On Spatio-Temporal Features With 2D Convolutional Neural Networks Framework
4	Research On Temporal Action Detection In Video
5	Research On Video Action Recognition Technology Based On Spatiotemporal Feature Extraction
6	Algorithm Of Complex Action Recognition Based On Temporal Proposals
7	Research On Temporal Action Location Method Combining Light And Heavy Networks In Untrimmed Video
8	Human Skeletal Action Recognition Based On Deep Learning
9	Research On Human Action Recognition Based On Adaptive Graph Convolutional Networks
10	Research On Human Action Recognition In Videos