The Application Of Deep Learning In Image Processing

Posted on: 2021-05-11
Degree: Master
Type: Thesis
Country: China
Candidate: Y Liu
Full Text: PDF
GTID: 2428330620464067
Subject: Engineering
Abstract/Summary:
With the continuous development of deep learning, various intelligent applications have emerged, such as face recognition, intelligent security, and new retail, covering many scenarios in human production and daily life. Researchers have therefore devoted themselves to exploring the essence of deep learning. LeCun, the father of convolutional neural networks, once said that the essence of intelligence is prediction, a statement that established the status of predictive learning in the field of artificial intelligence.

A video is a continuous sequence of image frames. Predictive learning on video means learning from historical frames to generate future frames. Because video prediction does not require human-annotated data, and large amounts of video data are readily available, the task has attracted considerable attention in recent years. Video prediction is currently used in weather forecasting, urban traffic flow forecasting, intelligent robot driving, and other scenarios. A survey of existing work shows that most deep learning models for video prediction are built from convolutional neural networks and recurrent neural networks.

This thesis studies video frame prediction algorithms based on long short-term memory (LSTM) networks and proposes two improved methods. The first is a sequence-to-sequence model that combines a pre-trained convolutional autoencoder with a convolutional LSTM network. The model is designed to avoid the damage to spatial information that occurs when an image is flattened into a vector, and it has few parameters, which makes it suitable for simple prediction tasks. Experiments show that it predicts better than the fully connected structure.

The second is a space-time differential long short-term memory (ST-DLSTM) network for more complicated prediction scenarios. The internal unit of the network contains a dual time memory mechanism, which uses a novel differential gate structure to control the memory difference between adjacent input frames, improving the ability of the network to capture information from distant time steps. A video frame prediction model was formed by stacking this network, and experiments were performed on the Moving-MNIST synthetic dataset, the TaxiBJ traffic flow dataset, and the KTH action recognition dataset. Other stackable prediction algorithms based on LSTM networks were also reproduced, and the experimental results show that the proposed ST-DLSTM network achieves the best prediction performance among the reproduced algorithms on multiple datasets.

This thesis also extends the video prediction model to action recognition. A video action recognition model is built on the ST-DLSTM network, with the parameters obtained in the video prediction task used as pre-training before the model is trained for action recognition. Compared with a video action recognition algorithm that combines a convolutional neural network with an LSTM network, the proposed algorithm improves recognition accuracy. The experiments show that features learned in the video frame prediction task generalize to other tasks.
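The claim that video prediction needs no human annotation can be made concrete with a small sketch: the training targets are simply the later frames of the same clip. The code below is illustrative only and is not the thesis's implementation. The names `make_samples`, `mse`, and `copy_last_baseline` are hypothetical, the frames are toy nested lists rather than real video, and the copy-last-frame predictor is a trivial stand-in for a learned model such as an LSTM-based network.

```python
from typing import List, Sequence, Tuple

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def make_samples(
    frames: Sequence[Frame], n_in: int, n_out: int
) -> List[Tuple[List[Frame], List[Frame]]]:
    """Slide a window over an unlabeled clip; the future frames are the targets."""
    window = n_in + n_out
    samples = []
    for start in range(len(frames) - window + 1):
        clip = list(frames[start:start + window])
        samples.append((clip[:n_in], clip[n_in:]))
    return samples


def mse(a: Frame, b: Frame) -> float:
    """Mean squared error between two frames of equal shape."""
    total = sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))
    return total / (len(a) * len(a[0]))


def copy_last_baseline(inputs: Sequence[Frame], n_out: int) -> List[Frame]:
    """Trivial predictor (an assumption, not the thesis model):
    repeat the last observed frame n_out times."""
    return [inputs[-1] for _ in range(n_out)]


# Toy clip: six 2x2 frames whose pixel values all equal the frame index.
clip = [[[float(t)] * 2 for _ in range(2)] for t in range(6)]

samples = make_samples(clip, n_in=3, n_out=2)
inputs, targets = samples[0]          # frames 0-2 in, frames 3-4 out
preds = copy_last_baseline(inputs, n_out=2)
errors = [mse(p, t) for p, t in zip(preds, targets)]
print(len(samples), errors)
```

In this toy setting the per-frame error grows with the prediction horizon, which is the kind of behavior a learned predictor is meant to improve on. A real pipeline would replace `copy_last_baseline` with a trained model and the toy clip with frames from a dataset such as Moving-MNIST.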
Keywords/Search Tags: Video frame prediction, Differential gate, Dual time memory, Space-time differential long short-term memory (ST-DLSTM) network