Research And Implementation Of Action Recognition Based On Deep Learning

Posted on: 2020-11-28
Degree: Master
Type: Thesis
Country: China
Candidate: M Yu
GTID: 2428330620456365
Subject: Electronic Science and Engineering
Abstract/Summary:
With the development of artificial intelligence and the Internet of Things, human action recognition is in great demand in video surveillance, human-computer interaction, virtual reality, motion analysis, and other fields. Building on the success of deep learning in image classification, researchers have applied it to action recognition. However, several problems remain: dynamic features are not represented efficiently enough, multi-modal information cannot be fully utilized, and the resulting applications are hard to deploy. To fully exploit temporal information and the complementarity of multiple modalities, this thesis studies efficient feature representation and feature fusion methods that improve the accuracy of action recognition.

There are three main contributions. (1) A novel dynamic feature representation called Human-Object Contour (HOC) is presented. HOC represents dynamic information based on the essence of optical flow and carries higher-order semantic information about object categories. The dynamic logic information in the video can be fully exploited to optimize the optical flow. (2) An extensible fusion network called Attentional Multi-modal Fusion Network (AMFN) is proposed, drawing on the principle of stacking in ensemble learning. Inspired by the selective attention mechanism of human vision, AMFN takes the characteristics of each individual video into account so as to make the most of the multi-modal information. (3) The action recognition application is implemented on the embedded development platform Jetson TX2, with an attempt to incorporate HOC to improve accuracy. In addition, the TensorRT engine is used to accelerate inference, which greatly increases the practical value of action recognition applications.

The experimental results lead to the following conclusions: (1) HOC is effectively complementary to the static RGB feature and helps action recognition in over 60% of categories. (2) Combining HOC and AMFN, the approach achieves outstanding performance on the HMDB51 (72.2%) and UCF101 (96.0%) datasets. (3) With TensorRT acceleration, the forward-pass speed on Jetson TX2 increases from 27 FPS to 153 FPS, a roughly 5.7-fold speedup. In summary, the experimental results meet the target indicators.
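The abstract does not give AMFN's architecture, so the following is only a minimal sketch of the general idea behind contribution (2): per-video attention weights decide how much each modality contributes to the fused prediction. The function name `attentional_fusion`, the two-modality setup (RGB and HOC), and all numeric values are hypothetical and chosen for illustration; in a real network the attention logits would be produced by a learned sub-network rather than supplied by hand.

```python
import math


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def attentional_fusion(modality_scores, attention_logits):
    """Fuse per-modality class scores with per-video attention weights.

    modality_scores: one list of raw class scores per modality.
    attention_logits: one scalar per modality (hypothetical; assumed to
        come from a learned attention branch conditioned on the video).
    Returns the fused class probabilities and the attention weights.
    """
    if len(modality_scores) != len(attention_logits):
        raise ValueError("need exactly one attention logit per modality")

    weights = softmax(attention_logits)
    num_classes = len(modality_scores[0])
    fused = [0.0] * num_classes

    for weight, scores in zip(weights, modality_scores):
        probs = softmax(scores)  # normalize each modality separately
        for c in range(num_classes):
            fused[c] += weight * probs[c]

    return fused, weights


# Illustrative values only: three classes, two modalities.
rgb_scores = [2.0, 1.0, 0.1]   # static appearance stream
hoc_scores = [0.5, 2.5, 0.3]   # dynamic (HOC-like) stream
fused, weights = attentional_fusion([rgb_scores, hoc_scores], [0.0, 1.0])
predicted = max(range(len(fused)), key=fused.__getitem__)
```

Because the weights sum to one and each modality is normalized first, the fused vector is itself a probability distribution. A video whose motion is more informative than its appearance would receive a larger weight on the dynamic stream, which is the per-video adaptivity the abstract attributes to attention-based fusion.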
Keywords/Search Tags:action recognition, deep learning, feature representation, feature fusion, video understanding