With the rapid development of deep learning, the transmission quality expected of multimedia devices continues to rise. However, many problems still arise when large numbers of videos are captured, transmitted, and stored, degrading the quality of the final video, so video super-resolution reconstruction algorithms are needed to recover clearer content. In current research, deep-learning-based video super-resolution reconstruction has become the mainstream approach in this field, and the technology has been widely applied in intelligent security, video restoration, and medical imaging. This paper studies end-to-end video super-resolution reconstruction networks. Compared with single-image super-resolution, video super-resolution must exploit the correlated information between adjacent video frames.

To address the insufficient temporal feature extraction of existing video super-resolution algorithms, this paper proposes a temporal feature enhancement model based on 3D convolution: the 3D convolutions extract temporal information between adjacent frames and supplement the features produced by the optical-flow network. At the same time, to reduce the over-smoothing that occurs during reconstruction and to improve convergence, the network is trained with the Mean Absolute Error (MAE) loss. The experimental results show that fully exploiting the temporal features between adjacent frames effectively improves the reconstruction quality of the network.

To address the problem that distant feature points cannot be used effectively during reconstruction, this paper introduces a non-local operation, the Non_Local module, into the reconstruction network to capture long-range pixel information within a video frame, enlarging the receptive field and strengthening the perceptual ability of the network. To address inefficient fusion of feature information, this paper further proposes a video super-resolution reconstruction model based on multi-scale feature fusion: through different cascading schemes, the features extracted by convolution kernels of different scales are fully fused, so the reconstructed result contains more high-frequency detail. Comparative experiments on standard datasets show that the video frame sequences reconstructed by the proposed method contain richer texture information, and that both the objective evaluation metrics and the subjective visual quality improve significantly, verifying the effectiveness of the proposed method.
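As an illustration of the temporal branch described above, the following is a minimal PyTorch sketch of temporal feature extraction with 3D convolution. The layer layout, channel widths, and the pooling step that collapses the temporal axis are assumptions for illustration; they are not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class TemporalFeatureEnhancer(nn.Module):
    """Extracts temporal features from a short clip of adjacent frames.

    Input:  (N, C, T, H, W) - a batch of T consecutive low-resolution frames.
    Output: (N, F, H, W)    - a per-clip feature map that could supplement
                              the features of an optical-flow branch.
    """
    def __init__(self, in_channels=3, feat_channels=64, num_frames=5):
        super().__init__()
        # 3D convolutions mix information across the temporal axis (T)
        # as well as the spatial axes (H, W).
        self.conv3d = nn.Sequential(
            nn.Conv3d(in_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv3d(feat_channels, feat_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Collapse the temporal dimension to a single feature map
        # (an assumed fusion step, one of several plausible choices).
        self.temporal_pool = nn.Conv3d(
            feat_channels, feat_channels, kernel_size=(num_frames, 1, 1)
        )

    def forward(self, clip):
        feats = self.conv3d(clip)          # (N, F, T, H, W)
        fused = self.temporal_pool(feats)  # (N, F, 1, H, W)
        return fused.squeeze(2)            # (N, F, H, W)

clip = torch.rand(1, 3, 5, 64, 64)          # 5 adjacent LR frames (toy data)
feats = TemporalFeatureEnhancer()(clip)     # -> (1, 64, 64, 64)
```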
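The MAE loss mentioned above is the L1 distance between the reconstructed frame and the ground truth; unlike MSE, it penalizes errors linearly rather than quadratically, which tends to produce sharper, less over-smoothed reconstructions. A minimal sketch in PyTorch, using toy tensors in place of real model outputs:

```python
import torch
import torch.nn as nn

# MAE (L1) loss: the mean absolute error over all pixels.
criterion = nn.L1Loss()

sr = torch.rand(1, 3, 256, 256, requires_grad=True)  # stand-in for a reconstructed frame
hr = torch.rand(1, 3, 256, 256)                      # stand-in for the ground-truth frame

loss = criterion(sr, hr)
assert torch.isclose(loss, (sr - hr).abs().mean())   # identical to the explicit formula
loss.backward()                                      # gradients flow as during training
```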
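The Non_Local module follows the general pattern of non-local neural networks, in which every spatial position attends to every other position, so distant pixels can contribute to the reconstruction. The sketch below is a standard embedded-Gaussian non-local block in PyTorch; the channel-reduction factor and the block's placement in the network are assumptions, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class NonLocalBlock2D(nn.Module):
    """Non-local operation: each position attends to all positions in the map."""
    def __init__(self, channels, reduction=2):
        super().__init__()
        inter = channels // reduction
        self.theta = nn.Conv2d(channels, inter, kernel_size=1)  # query embedding
        self.phi = nn.Conv2d(channels, inter, kernel_size=1)    # key embedding
        self.g = nn.Conv2d(channels, inter, kernel_size=1)      # value embedding
        self.out = nn.Conv2d(inter, channels, kernel_size=1)    # restore channels

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.theta(x).flatten(2).transpose(1, 2)  # (N, HW, C')
        k = self.phi(x).flatten(2)                    # (N, C', HW)
        v = self.g(x).flatten(2).transpose(1, 2)      # (N, HW, C')
        attn = torch.softmax(q @ k, dim=-1)           # (N, HW, HW) pairwise weights
        y = (attn @ v).transpose(1, 2).reshape(n, -1, h, w)
        return x + self.out(y)                        # residual connection

y = NonLocalBlock2D(64)(torch.rand(1, 64, 32, 32))    # same shape in and out
```

Because the attention matrix is HW x HW, memory grows quadratically with the spatial size, which is why such blocks are typically applied to feature maps of moderate resolution rather than to full frames.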
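One possible reading of the multi-scale fusion design is parallel convolution branches with different kernel sizes whose outputs are cascaded (concatenated along the channel axis) and then fused. The kernel sizes (3, 5, 7) and the 1x1 fusion convolution below are illustrative assumptions; the paper's specific cascading schemes are not detailed in this abstract.

```python
import torch
import torch.nn as nn

class MultiScaleFusion(nn.Module):
    """Extracts features with kernels of several scales and fuses them."""
    def __init__(self, channels=64):
        super().__init__()
        # Parallel branches with different receptive fields.
        self.branch3 = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.branch5 = nn.Conv2d(channels, channels, kernel_size=5, padding=2)
        self.branch7 = nn.Conv2d(channels, channels, kernel_size=7, padding=3)
        self.relu = nn.ReLU(inplace=True)
        # Cascade (concatenate) the branch outputs, then fuse with a 1x1 conv.
        self.fuse = nn.Conv2d(channels * 3, channels, kernel_size=1)

    def forward(self, x):
        f3 = self.relu(self.branch3(x))
        f5 = self.relu(self.branch5(x))
        f7 = self.relu(self.branch7(x))
        cascaded = torch.cat([f3, f5, f7], dim=1)  # channel-wise cascade
        return x + self.fuse(cascaded)             # residual fusion

out = MultiScaleFusion(64)(torch.rand(1, 64, 32, 32))  # same shape in and out
```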