Human Action Recognition Based On Two-Stream Network

Posted on:2020-06-08

Degree:Master

Type:Thesis

Country:China

Candidate:X Bai

Full Text:PDF

GTID:2428330575464397

Subject:Information and Communication Engineering

Abstract/Summary:

As technology advances, people increasingly pursue a more intelligent way of life. Human action recognition, an indispensable research direction on the path to artificial intelligence, has attracted extensive attention from both the research community and industry. Traditional action recognition methods rely mainly on hand-crafted features, which have various limitations and cannot meet current needs. In recent years, deep learning has achieved great success in computer vision, which also opens a new path for action recognition research. A large number of deep-learning-based human action recognition methods have been proposed, promoting the application and development of the field. This thesis focuses on human action recognition based on the two-stream network and improves existing two-stream methods in terms of recognition accuracy and processing speed, respectively. The main work is as follows.

Firstly, a spatiotemporal heterogeneous two-stream network based on long-range temporal structure modeling is proposed. Human recognition and understanding of appearance and of motion are two completely different processes, yet most existing two-stream models adopt the same structure for the spatial and temporal networks. This thesis therefore proposes a spatiotemporal heterogeneous two-stream network, which uses two different network structures to process spatial and temporal information. To maximize its performance, ResNet and BN-Inception are used as the base networks to extract more discriminative spatiotemporal features. In addition, a segmental architecture is employed to model long-range temporal structure over video sequences, so as to better distinguish similar actions that share sub-actions. Moreover, a modified cross-modal pre-training strategy, combined with data augmentation, is proposed to further improve recognition accuracy. Experiments on the UCF101 and HMDB51 datasets demonstrate that the proposed spatiotemporal heterogeneous two-stream network outperforms spatiotemporal isomorphic two-stream networks and other related methods.

Secondly, to address the high computational cost and poor real-time performance of optical flow in current two-stream methods, a real-time action recognition method based on enhanced motion vectors is proposed. By replacing optical flow with motion vectors, a Spatiotemporal Heterogeneous Two-stream Network Based on Motion Vector (MV-STH) is constructed, which reduces computational complexity and enables real-time processing of video sequences. Motion vectors are widely used in video compression standards and can be obtained directly during decoding without additional computation. However, motion vectors lack fine structure, which leads to an evident degradation in recognition performance. Thus, a knowledge transfer strategy is introduced that initializes the MV-STH network with the pre-trained model learned from optical flow; the resulting network is called the Spatiotemporal Heterogeneous Two-stream Network Based on Enhanced Motion Vector (EMV-STH). This method achieves recognition performance comparable to some state-of-the-art approaches on UCF101 and HMDB51. More importantly, its processing speed is about 13 times that of the spatiotemporal heterogeneous two-stream network based on optical flow.
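
The segmental architecture and two-stream fusion described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names are hypothetical, averaging is assumed as the consensus function, and the temporal weight of 1.5 is an assumed value, since the abstract specifies none of these. The per-snippet scores stand in for the outputs of the spatial and temporal networks.

```python
import math
import random


def sample_segment_indices(num_frames, num_segments, rng=None):
    """Split a video into equal-length segments and pick one frame index from each.

    Sampling one snippet per segment lets the model see the whole video
    (long-range structure) at a fixed cost, regardless of video length.
    """
    if num_segments <= 0 or num_frames < num_segments:
        raise ValueError("need at least one frame per segment")
    rng = rng or random.Random()
    indices = []
    for k in range(num_segments):
        start = k * num_frames // num_segments        # inclusive
        end = (k + 1) * num_frames // num_segments    # exclusive
        indices.append(rng.randrange(start, end))
    return indices


def segmental_consensus(snippet_scores):
    """Average per-snippet class scores into one video-level score vector.

    Averaging is an assumption here; other consensus functions are possible.
    """
    n = len(snippet_scores)
    return [sum(col) / n for col in zip(*snippet_scores)]


def softmax(scores):
    """Numerically stable softmax over a list of class scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(spatial_scores, temporal_scores, temporal_weight=1.5):
    """Late fusion: weighted sum of the per-stream class probabilities.

    temporal_weight is an assumed value, not one taken from the thesis.
    """
    p_spatial = softmax(spatial_scores)
    p_temporal = softmax(temporal_scores)
    return [a + temporal_weight * b for a, b in zip(p_spatial, p_temporal)]
```

In use, each stream would score the snippets selected by `sample_segment_indices`, `segmental_consensus` would reduce those scores to one vector per stream, and `fuse_two_streams` would combine the two vectors; the predicted class is the index of the largest fused value.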

Keywords/Search Tags:

Human action recognition, Two-stream Network, Spatiotemporal Heterogeneity, Long-range Temporal Structure, Pre-training, Motion Vector, Real-time Processing

Related items

1	Research On Human Action Recognition Method Based On Deep Learning
2	Temporal Action Localization And Action Recognition Based On Deep Learning
3	Action Recognition Based On Spatiotemporal Attention Depth Model
4	Research On Human Action Recognition Method Based On Spatiotemporal Features
5	Research On Anomalous Human Action Detection Based On Two-stream Spatiotemporal Residual Networks
6	Action Recognition Of Human Skeleton Motion Sequences Based On Deep Learning
7	Research On Spatiotemporal Two-Stream Human Action Recognition Method Based On Skeleton
8	Research On Human Action Recognition Based On Motion Sequence Features
9	Human Action Recognition Based On Two-stream Convolutional Network
10	Human Action Recognition Based On Spatiotemporal Two Stream Convolution Network