Experiment Of Decoupled Operators On Two-Stream Convolutional Neural Networks

Posted on:2020-09-09

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Zhao

GTID:2417330590482853

Subject:Applied Statistics

Abstract/Summary:

PDF Full Text Request

Action recognition is a fundamental problem in computer vision. As the technology has matured, it has been applied in intelligent video surveillance, virtual reality, Internet video retrieval, human-computer interaction, and other scenarios, and it has broad prospects. However, action recognition still faces many difficulties, including how to extract powerful features and how to integrate multiple features. These problems have hindered the deployment of action recognition in engineering practice, so this thesis focuses on improving the accuracy and robustness of action recognition algorithms.

Since the success of convolutional neural networks on image tasks in 2012, new network structures such as VGG and ResNet have emerged. In the video domain there are now many network techniques, such as two-stream neural networks and 3D convolution. However, the feature representation ability of current networks is not strong enough, and action recognition datasets are small compared with ImageNet, so models overfit easily during training. To address this problem, and building on the results of two previous studies, this thesis extends the decoupled operator to a deep two-stream convolutional neural network and examines whether it improves the feature representation ability.

The dataset used is UCF-101, which contains 13,320 video clips in 101 action categories. The VGG16 architecture is used to build a temporal stream network and a spatial stream network. During training, ImageNet-pretrained models are used to initialize the networks, and online data augmentation is applied. Because the networks are pretrained, a low learning rate is used: the temporal stream network starts at a learning rate of 0.001 and decays it by a factor of 1/10 every 100 iterations, while the spatial stream network starts at 0.001, decays by a factor of 1/10 at 200 iterations, and then decays by a factor of 1/10 every 100 iterations.

After training and testing, the traditional two-stream convolutional network reaches an accuracy of 73.619% for the spatial stream network, 67.962% for the temporal stream network, and 77.134% for the fused two-stream network. With the decoupled operator, the two-stream convolutional network reaches 74.015% for the spatial stream network, 68.214% for the temporal stream network, and 78.192% for the fused two-stream network, a gain of about 1.06 percentage points in fused accuracy.
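
The learning-rate schedules described in the abstract can be illustrated with a minimal sketch. It rests on two assumptions that the abstract does not spell out: "decays by 1/10" is read as multiplying the rate by 0.1, and the spatial stream is read as decaying first at iteration 200 and then every 100 iterations after that. The function names `step_lr`, `temporal_lr`, and `spatial_lr` are hypothetical and are not taken from the thesis.

```python
def step_lr(iteration, base_lr=0.001, first_decay=100, step=100, gamma=0.1):
    """Learning rate at `iteration` under a step-decay schedule.

    Assumption: each decay multiplies the rate by `gamma` (0.1). The first
    decay happens at `first_decay`, later ones every `step` iterations.
    """
    if iteration < first_decay:
        return base_lr
    # Count how many decay points have been reached so far.
    num_decays = 1 + (iteration - first_decay) // step
    return base_lr * gamma ** num_decays


def temporal_lr(iteration):
    # Temporal stream: start at 0.001, decay every 100 iterations.
    return step_lr(iteration, base_lr=0.001, first_decay=100, step=100)


def spatial_lr(iteration):
    # Spatial stream: start at 0.001, first decay at iteration 200,
    # then every 100 iterations (assumed reading of the abstract).
    return step_lr(iteration, base_lr=0.001, first_decay=200, step=100)


if __name__ == "__main__":
    for it in (0, 100, 200, 300):
        print(it, temporal_lr(it), spatial_lr(it))
```

Under this reading the spatial stream stays at the initial rate for twice as long as the temporal stream before its first decay; the thesis full text should be consulted for the exact schedule.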

Keywords/Search Tags:

action recognition, decoupled operators, two stream networks, convolutional neural networks

Related items

1	Mathematical Arithmetic Operators Recognition Based On The Character-level Convolutional Neural Network
2	Prediction Of The Career Development Direction Of College Students Based On Convolutional Neural Networks
3	Research On Student Behavior Description Methods In Mainstream Learning Scenarios
4	Research And Implementation Of Teaching Resource Integration Method Based On Convolutional Neural Network
5	Research On Action Recognition Of Student In Classroom Based On Convolutional Neural Network
6	Graph Convolutional Networks:An Application To Open Educational Resources
7	Mixed Uncertainty Modular Neural Networks And Universities Benefit Forecast
8	Research On Online Public Opinion Recognition Based On Deep Learning
9	Image Classification Algorithm Of Convolutional Neural Network Based On Spatial Pyramid Pooling
10	The Forming And Application Of Aerobics Forecasting Mock-up Based On Atificial Neural Networks