| With the development of information technology, the cost of video transmission and storage keeps falling, and video applications such as online teaching and live webcasting have greatly enriched people’s lives. The widespread use of video has led to explosive growth in video data, which places a heavy burden on video storage and transmission. Although researchers have proposed many video compression methods, these methods still cannot fully meet the requirements of all video application scenarios. In recent years, deep learning has developed rapidly and achieved significant results in many fields, which has led researchers to use neural networks to improve the efficiency of video coding. Many deep-learning-based video coding methods have now been proposed. Even though these methods perform well, there is still considerable room for optimization. For example, most deep-learning-based video coding models use the same encoding method for all frames of the same type within a group of pictures, and cannot adaptively encode video frames according to their temporal information. In addition, these models generally use optical flow as the motion information between adjacent frames, so improving the motion estimation module's ability to estimate optical flow is one of the key issues in improving video coding performance. To address these issues, the thesis proposes two deep-learning-based algorithms that improve the optical flow estimation and adaptive coding capabilities of deep video coding networks, respectively. The specific work is as follows. 1. An optical flow estimation algorithm based on denoising and mask segmentation is proposed. In this algorithm, a denoising module and a mask segmentation module process the input and output of optical flow estimation, respectively. The denoising module uses convolutional layers to 
denoise the input video frames, and uses residual learning to enhance the features of the denoised frames, compensating for details lost during denoising. The mask segmentation module first generates a mask for optical flow segmentation through a mask generation network, and then uses the generated mask to divide the predicted optical flow into foreground and background optical flow. The optical flow post-processing module then applies differentiated processing to the segmented optical flow, making the important foreground optical flow more precise and improving motion compensation. Experimental results show that the optical flow estimation algorithm based on denoising and mask segmentation effectively reduces the rate-distortion cost of optical flow compression and improves the compression efficiency of video coding networks. 2. A temporal adaptive algorithm is proposed. A control information calculation module converts the position of a video frame within its group of pictures into time control information, and the algorithm then selects network structures and time weight vectors according to this information to adaptively encode the current frame, which improves the compression efficiency of the deep video coding model. In addition, to make the temporal adaptive module more versatile, the thesis adds a temporal interpolation module to it, which enables the module to adaptively encode video frames in groups of pictures of different sizes without being trained separately for each size. Experimental results show that the temporal adaptive module can serve as a generic plug-in that improves the compression performance of various deep-learning-based video coding models. |
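
The temporal interpolation idea in item 2 can be illustrated with a minimal sketch. The abstract does not specify how the time control information or the interpolation is computed, so everything here is an assumption made for illustration: the control value is taken to be the frame's normalized position in its group of pictures, the time weight vectors are assumed to be stored one per frame position of the training group size, and plain linear interpolation stands in for whatever the thesis actually uses. The names `time_control` and `interpolate_weights` are hypothetical.

```python
def time_control(frame_index, gop_size):
    """Map a frame's position in its group of pictures to a value in [0, 1].

    Assumption: normalized position is used as the time control information.
    """
    if gop_size < 2:
        return 0.0
    return frame_index / (gop_size - 1)


def interpolate_weights(trained_weights, frame_index, gop_size):
    """Linearly interpolate a time weight vector for an arbitrary GOP size.

    trained_weights: list of weight vectors, one per frame position of the
    GOP size seen during training (hypothetical layout).
    """
    last = len(trained_weights) - 1
    # Position of this frame on the trained grid, as a fractional index.
    t = time_control(frame_index, gop_size) * last
    lo = int(t)
    hi = min(lo + 1, last)
    frac = t - lo
    # Blend the two neighbouring trained vectors element by element.
    return [(1 - frac) * a + frac * b
            for a, b in zip(trained_weights[lo], trained_weights[hi])]


# Weight vectors trained for a GOP of 3 frames, reused for a GOP of 5 frames.
trained = [[0.0, 1.0], [1.0, 0.0], [2.0, 1.0]]
print(interpolate_weights(trained, 1, 5))  # [0.5, 0.5]
```

Under these assumptions, a frame that falls between two trained positions receives a blend of their weight vectors, which is one way a module trained on a single group size could be reused on other sizes without retraining.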