End-to-End Optimized Video Coding Technology Research

Posted on: 2024-07-10; Degree: Master; Type: Thesis
Country: China; Candidate: J Z Wu; Full Text: PDF
GTID: 2568307079966039; Subject: Electronic information
Abstract/Summary:
With the continuous improvement of transmission and storage technology, high-quality video data has become ubiquitous in daily life. However, uncompressed video occupies a large amount of storage and bandwidth, so developing and continuously optimizing video compression technology is of real practical value. Video compression works mainly by reducing signal redundancy and visual redundancy. In recent years, deep learning has achieved remarkable results in many computer vision fields. At the same time, traditional hybrid video compression algorithms have deficiencies such as discrete modules and the inability to optimize them jointly. It is therefore worth exploring how the strong learning ability and non-linear modeling capacity of deep networks can be applied to video compression. This thesis focuses on end-to-end optimized video encoding and decoding. The overall encoding and decoding process is similar to that of traditional hybrid video codecs, and the thesis proposes improvements in motion compensation and prediction frame generation. The main work is as follows:

1. Multi-scale feature-based motion compensation algorithm: This thesis proposes a multi-scale feature-based motion compensation algorithm to generate reconstruction frames more accurately and efficiently. The algorithm performs motion compensation on feature maps at different scales in the feature space. It first learns coarse information from the small-scale features and generates a rough reconstruction frame, then generates the final, finely detailed prediction frame by performing motion compensation on the large-scale features. In the experiments, the proposed algorithm achieved an average BD-Rate savings of 21.14% compared with HEVC. It was also compared with FVC, a state-of-the-art fully network-based video compression algorithm from the same period known for its superior performance. The results show performance gains across all test sets, with an average BD-Rate savings of 1.73%.

2. Multi-reference frame-assisted prediction frame generation algorithm: This thesis proposes a multi-reference frame-assisted prediction frame generation algorithm that produces more accurate prediction frames during motion compensation, using more detailed information from adjacent frames to improve the quality of the prediction frame and reduce the residual. The algorithm fuses the three preceding frames with the prediction frame obtained after motion compensation to extract more information. In the experiments, the proposed algorithm achieved an average BD-Rate savings of 25.03% compared with HEVC. It was also compared with FVC, a state-of-the-art end-to-end video compression algorithm from the same period known for its superior performance. The results indicate that the proposed algorithm outperforms FVC on various datasets, with an average BD-Rate savings of 3.86%.
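
The BD-Rate figures quoted in the abstract express the average bitrate change at equal quality between two codecs. The following is a minimal sketch of how such a figure is commonly computed, assuming four rate-distortion points per codec and a cubic fit of log-bitrate against PSNR. The function names and the sample points are hypothetical and synthetic; they are not taken from the thesis, which does not state which BD-Rate implementation was used.

```python
import math

def _cubic_through(xs, ys):
    """Return the Lagrange cubic p(x) that passes through four (x, y) points."""
    def p(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return p

def _mean_over(p, lo, hi):
    """Mean value of a cubic p on [lo, hi]; Simpson's rule is exact for cubics."""
    return (p(lo) + 4.0 * p((lo + hi) / 2.0) + p(hi)) / 6.0

def bd_rate(anchor, test):
    """Average bitrate change in percent of `test` relative to `anchor`.

    Each argument is a list of four (bitrate, psnr) pairs with distinct PSNR
    values. A negative result means the test codec saves bitrate.
    """
    if len(anchor) != 4 or len(test) != 4:
        raise ValueError("this sketch expects exactly four RD points per codec")
    # Model log-bitrate as a cubic function of PSNR for each codec.
    fit_a = _cubic_through([q for _, q in anchor], [math.log(r) for r, _ in anchor])
    fit_t = _cubic_through([q for _, q in test], [math.log(r) for r, _ in test])
    # Compare only over the PSNR range covered by both curves.
    lo = max(min(q for _, q in anchor), min(q for _, q in test))
    hi = min(max(q for _, q in anchor), max(q for _, q in test))
    if hi <= lo:
        raise ValueError("the two curves have no overlapping PSNR range")
    avg_log_diff = _mean_over(fit_t, lo, hi) - _mean_over(fit_a, lo, hi)
    return (math.exp(avg_log_diff) - 1.0) * 100.0

# Synthetic example (not thesis data): the test codec needs 80% of the
# anchor's bitrate at every quality level.
anchor_points = [(1000.0, 32.0), (2000.0, 34.5), (4000.0, 37.0), (8000.0, 39.0)]
test_points = [(0.8 * r, q) for r, q in anchor_points]
print(round(bd_rate(anchor_points, test_points), 2))
```

Because the synthetic test curve is the anchor curve with every bitrate scaled by 0.8, the result is a savings of about 20% by construction. Real rate-distortion curves cross different PSNR values, so the overlap range and the fit both affect the result.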
Keywords/Search Tags: Video encoding and decoding, Deep learning, Multi-reference frame, Multi-scale feature