Research And Implementation Of Action Recognition Based On Deep Learning

Posted on:2020-11-28

Degree:Master

Type:Thesis

Country:China

Candidate:M Yu

Full Text:PDF

GTID:2428330620456365

Subject:Electronic Science and Engineering

Abstract/Summary:

With the development of artificial intelligence and the Internet of Things, human action recognition is in great demand in video surveillance, human-computer interaction, virtual reality, motion analysis, and other fields. Building on the great success of image classification, researchers have applied deep learning methods to action recognition. However, problems remain: dynamic features are not represented efficiently enough, multimodal information is not fully utilized, and applications are difficult to deploy. To fully exploit temporal information and the complementarity of multiple modalities, efficient feature representation and feature fusion methods are studied to improve the accuracy of action recognition.

There are three main contributions. (1) A novel dynamic feature representation called Human-Object Contour (HOC) is presented. HOC represents dynamic information based on the essence of optical flow and contains higher-order semantic information about object category, so the dynamic logic information in the video can be fully exploited to optimize the optical flow. (2) An extensible fusion network called the Attentional Multi-modal Fusion Network (AMFN) is proposed, drawing on the principle of Stacking in ensemble learning. Inspired by the selective attention mechanism of human vision, it takes the characteristics of each individual video into account to make maximal use of multi-modal information. (3) An action recognition application is implemented on the embedded development platform Jetson TX2, which attempts to incorporate HOC to improve accuracy. In addition, the TensorRT engine is used to accelerate inference, greatly increasing the practical value of action recognition applications.

The experimental results lead to the following conclusions: (1) HOC is effectively complementary to the RGB static feature and helps action recognition in over 60% of categories. (2) Combining HOC and AMFN, the approach obtains outstanding performance on the HMDB51 (72.2%) and UCF101 (96.0%) datasets. (3) With TensorRT acceleration, the speed of forward calculation on Jetson TX2 increases from 27 FPS to 153 FPS. In summary, the experimental results meet the target indicators.
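
The abstract does not give the internals of AMFN, so the following is only a minimal sketch of the general idea of attention-weighted fusion of per-modality predictions, not the thesis's network. The function names, the modality names, the toy scores, and the attention logits are all hypothetical; in a trained model the logits would presumably come from a learned layer that looks at each video.

```python
import math

def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]

def attention_fuse(modality_scores, attention_logits):
    """Fuse per-modality class probabilities with per-video attention weights.

    modality_scores: dict mapping modality name -> list of class probabilities.
    attention_logits: dict mapping modality name -> float. This is a
        hypothetical stand-in for the output of a learned attention layer,
        which the abstract does not specify.
    Returns (weights per modality, fused class probabilities).
    """
    names = sorted(modality_scores)
    weights = softmax([attention_logits[n] for n in names])
    num_classes = len(modality_scores[names[0]])
    fused = [0.0] * num_classes
    for name, weight in zip(names, weights):
        for cls, prob in enumerate(modality_scores[name]):
            fused[cls] += weight * prob
    return dict(zip(names, weights)), fused

# Toy example with made-up numbers: three classes, three modalities.
scores = {
    "rgb": [0.6, 0.3, 0.1],
    "flow": [0.2, 0.7, 0.1],
    "hoc": [0.1, 0.8, 0.1],
}
logits = {"rgb": 0.0, "flow": 0.0, "hoc": 0.0}  # equal attention
weights, fused = attention_fuse(scores, logits)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

With equal logits the fusion reduces to plain averaging of the modality scores; raising one modality's logit for a given video shifts the fused prediction toward that modality, which is the per-video behaviour the abstract attributes to the attention mechanism.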

Keywords/Search Tags:

action recognition, deep learning, feature representation, feature fusion, video understanding

Related items

1	Research On Video Action Recognition Method Based On Spatio-temporal Feature Modeling
2	Analyzing And Understanding Human Actions In Videos
3	Deep Convolutional Video Representation Learning
4	Research On Video Action Recognition Method Based On Spatial-Temporal Feature Fusion And Deep Learning
5	Research And Implementation Of Video Action Recognition Based On Feature Fusion And Hybrid Attention Mechanism
6	Research On Action Recognition Method Based On Motion Feature Extraction And Spatio-temporal Feature Fusion
7	Research On Human Action Recognition Method Based On Deep Learning
8	Research On Representation-level Features Extraction And Fusion Classification Method Of Human Actions In Video Sequences
9	Research On Video Action Recognition Based On Spatial-temporal Feature Fusion
10	Research On Key Technologies Of Deep Representation Based Visual Understanding