With the development of video technology and the popularity of display devices, a large number of videos are produced on the Internet every day, and the demand for video editing keeps growing. Among video editing tasks, video super-resolution and video inpainting are two common restoration methods with a wide range of application scenarios. Existing video restoration methods mostly rely on sliding-window approaches to process image sequences. These methods are limited by the length of the input sequence and cannot fully exploit the temporal information within a video. In recent years, algorithms have emerged that use Transformers to model image sequences. Although these algorithms leverage temporal information effectively, their partitioning strategy causes the loss of fine-grained details. How to better use the temporal and fine-grained information in a video when designing task-specific deep neural network models is the main research content of this thesis. To this end, we propose a bidirectional recurrent neural network as a basic model for both problems, and on top of this model we make task-specific designs.

Video super-resolution can increase the resolution of a video and enrich its details. In this thesis, a convolution-based bidirectional recurrent neural network is proposed. First, the feature sequence of each frame is extracted by a convolutional module; then, the temporal features of the frame sequence are fused through a recurrent input strategy; finally, the target frame is reconstructed. The whole model is trained end to end, and with the help of gradients backpropagated through time steps it implicitly learns the motion relationship between frames and better fuses temporal features. In addition, we introduce channel distillation and a channel-attention mechanism into the convolutional modules, which further improves the learning ability of the model while reducing the amount of computation. A series of experiments shows that our model produces better results with fewer parameters.

Video inpainting generates natural video content in the masked region of each frame to achieve a completion effect, and can be used in scenarios such as removal of unwanted content and watermarks. In this work, we propose a dense recurrent neural network (RNN) model for video inpainting. The goal of our method is to simultaneously exploit both the temporal coherence and the fine-grained information in a video within one model. To this end, on the one hand, we employ dense optical flow to explicitly align the features of video frames, capturing the fine-grained information that is useful for synthesizing the missing content. On the other hand, we take advantage of the RNN to process video frames iteratively and integrate temporally coherent information. The RNN is designed to be bidirectional and cascaded multiple times with dense connections, so that the temporal coherence of the video can be sufficiently traversed and integrated. Furthermore, we introduce a two-stage learning strategy in which two dense RNN models synthesize the structure and texture information in succession, forming our final Coupled Dense RNN (CD-RNN). We conduct extensive experiments on benchmark datasets, where our proposed approach outperforms existing state-of-the-art video inpainting methods in terms of both effectiveness and efficiency.
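To make the super-resolution design concrete, the following is a minimal sketch of a convolution-based bidirectional recurrent model with channel attention, assuming PyTorch. The module names (`SEBlock`, `BiRecurrentSR`) and all hyper-parameters are illustrative assumptions rather than the exact architecture of the thesis, and the channel-distillation branch is omitted for brevity.

```python
# Illustrative sketch only: module names and sizes are assumptions,
# not the thesis implementation. Assumes PyTorch.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SEBlock(nn.Module):
    """Squeeze-and-excitation style channel attention."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.fc1 = nn.Conv2d(channels, channels // reduction, 1)
        self.fc2 = nn.Conv2d(channels // reduction, channels, 1)

    def forward(self, x):
        w = F.adaptive_avg_pool2d(x, 1)           # global channel statistics
        w = torch.sigmoid(self.fc2(F.relu(self.fc1(w))))
        return x * w                              # reweight channels

class BiRecurrentSR(nn.Module):
    """Fuse per-frame features with forward and backward hidden states."""
    def __init__(self, feat=64, scale=4):
        super().__init__()
        self.extract = nn.Conv2d(3, feat, 3, padding=1)
        self.fwd_cell = nn.Conv2d(2 * feat, feat, 3, padding=1)
        self.bwd_cell = nn.Conv2d(2 * feat, feat, 3, padding=1)
        self.attn = SEBlock(feat)
        self.upsample = nn.Sequential(
            nn.Conv2d(2 * feat, 3 * scale * scale, 3, padding=1),
            nn.PixelShuffle(scale))

    def forward(self, frames):                    # frames: (T, 3, H, W)
        feats = [self.extract(f.unsqueeze(0)) for f in frames]
        h = torch.zeros_like(feats[0])
        fwd = []
        for f in feats:                           # forward pass in time
            h = self.attn(F.relu(self.fwd_cell(torch.cat([f, h], dim=1))))
            fwd.append(h)
        h = torch.zeros_like(feats[0])
        out = []
        for f, g in zip(reversed(feats), reversed(fwd)):  # backward pass
            h = self.attn(F.relu(self.bwd_cell(torch.cat([f, h], dim=1))))
            out.append(self.upsample(torch.cat([g, h], dim=1)))
        return torch.cat(list(reversed(out)), dim=0)      # (T, 3, sH, sW)
```

Because the same hidden state is carried across time steps, gradients flow through the whole sequence during end-to-end training, which is what allows the model to learn inter-frame motion implicitly without an explicit alignment module.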
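The flow-guided alignment step used in the inpainting model can likewise be illustrated with a short feature-warping routine, again assuming PyTorch. The helper `warp` and the variable names in the usage comment are hypothetical; the thesis's flow estimator and recurrent cells are not reproduced here.

```python
# Hedged sketch of flow-guided feature warping, the alignment step of the
# dense recurrent inpainting model. Helper names are illustrative.
import torch
import torch.nn.functional as F

def warp(feat, flow):
    """Warp `feat` (N, C, H, W) with dense optical flow (N, 2, H, W).

    flow[:, 0] is the horizontal displacement in pixels and flow[:, 1]
    the vertical one, mapping each target pixel back to its source.
    """
    n, _, h, w = feat.shape
    # Base sampling grid in pixel coordinates.
    ys, xs = torch.meshgrid(torch.arange(h, dtype=feat.dtype),
                            torch.arange(w, dtype=feat.dtype),
                            indexing="ij")
    grid_x = xs.to(feat.device) + flow[:, 0]      # (N, H, W)
    grid_y = ys.to(feat.device) + flow[:, 1]
    # Normalize to [-1, 1] as required by grid_sample.
    grid_x = 2.0 * grid_x / max(w - 1, 1) - 1.0
    grid_y = 2.0 * grid_y / max(h - 1, 1) - 1.0
    grid = torch.stack((grid_x, grid_y), dim=-1)  # (N, H, W, 2)
    return F.grid_sample(feat, grid, align_corners=True)

# Hypothetical usage: align the previous hidden state to the current frame
# before the recurrent update, so fine-grained details land on the right
# pixels.
# h_aligned = warp(h_prev, flow_prev_to_cur)
# h_cur = rnn_cell(torch.cat([feat_cur, h_aligned], dim=1))
```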
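Finally, the cascaded bidirectional design with dense connections can be sketched as repeated forward/backward passes whose inputs concatenate all earlier outputs. `BiPass` and `DenseBiRNN` below are illustrative assumptions about how such a cascade might be wired, not the CD-RNN implementation.

```python
# Illustrative sketch of cascaded bidirectional passes with dense
# connections. Names and sizes are assumptions. Assumes PyTorch.
import torch
import torch.nn as nn

class BiPass(nn.Module):
    """One bidirectional recurrent pass over a feature sequence."""
    def __init__(self, in_ch, feat=64):
        super().__init__()
        self.feat = feat
        self.fwd = nn.Conv2d(in_ch + feat, feat, 3, padding=1)
        self.bwd = nn.Conv2d(in_ch + feat, feat, 3, padding=1)

    def run(self, cell, feats):
        n, _, h_, w_ = feats[0].shape
        h = feats[0].new_zeros(n, self.feat, h_, w_)
        out = []
        for f in feats:
            h = torch.relu(cell(torch.cat([f, h], dim=1)))
            out.append(h)
        return out

    def forward(self, feats):                     # list of (N, in_ch, H, W)
        f = self.run(self.fwd, feats)
        b = self.run(self.bwd, feats[::-1])[::-1]
        return [x + y for x, y in zip(f, b)]      # merge both directions

class DenseBiRNN(nn.Module):
    """Cascade bidirectional passes; each pass sees all earlier outputs."""
    def __init__(self, in_ch=64, feat=64, passes=3):
        super().__init__()
        self.passes = nn.ModuleList(
            BiPass(in_ch + i * feat, feat) for i in range(passes))

    def forward(self, feats):
        outs = [feats]                            # dense connections
        for p in self.passes:
            cat = [torch.cat(fs, dim=1) for fs in zip(*outs)]
            outs.append(p(cat))
        return outs[-1]
```

In the two-stage strategy described above, one such network would first synthesize the structure of the missing region and a second one would then refine the texture, coupling the two stages into the final model.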