Font Size: a A A

Video Classification Technology Based On Deep Learning From An Audio Perspective

Posted on:2022-01-21Degree:MasterType:Thesis
Country:ChinaCandidate:Y SunFull Text:PDF
GTID:2518306323460274Subject:Software engineering
Abstract/Summary:PDF Full Text Request
For a long time,video classification is a popular research field that scholars in the field of computer vision pay more attention to,and it is also full of many challenges.People pay more and more attention to video classification methods based on deep learning.The main reason is that it can extract image feature information from video frames and use special network structures for parameter training and optimization,and extract image features for video classification..In recent years,deep learning technology has been favored by many researchers,and it has also played an active and effective role in the field of video classification.With the development of communication technology,the development of the network field is also improving.At the same time,the field of self-media applications is also developing rapidly.People are more willing to publish interesting things in daily life to the Internet.This phenomenon has appeared in the classification of video,video screening and The advancement of retrieval technology has proposed new research directions.In this realistic background,this article is mainly based on deep learning algorithms to study the short video classification problem.First introduce the types of data sets used and related theoretical technologies,mainly introduced the theoretical algorithms including audio feature processing technology and deep learning,followed by a detailed introduction of several commonly used video classification methods,mainly including:Convolutional Neural Network And recurrent neural networks.Then it elaborated on several well-known classification network model structures.The fourth chapter of this paper introduces the hybrid neural network model structure designed in this paper in detail.The designed network combines the C3 D network model and the LSTM model,and changes the traditional network structure.Use a multi-modal hybrid neural network to solve the video classification problem.This article focuses on how to use hybrid neural networks to improve the accuracy of short video classification.The focus of the network module is a hybrid neural network composed of a long and short-term memory model and a three-dimensional convolutional neural network.A new network model is designed to study the low-precision classification categories in the UCF101 data set.By integrating various useful clues,to capture the semantic features of the frame.The short-term memory network and the threedimensional network spatio-temporal features are used to extract motion features to identify the classification of short videos,and the training model is continuously improved.At the end of the paper,experiments were conducted based on the design ideas and network structure.The experimental results show that the hybrid neural network combination method we proposed is accurate and effective.
Keywords/Search Tags:video classification, multi-modal features, convolutional neural network, LSTM
PDF Full Text Request
Related items