Research On Image Compression Methods With Learned Wavelet-Like Transforms

Posted on: 2023-12-15
Degree: Doctor
Type: Dissertation
Country: China
Candidate: H C Ma
Full Text: PDF
GTID: 1528306902954489
Subject: Information and Communication Engineering
Abstract/Summary:
With the development of digital imaging technology, communicating through digital images has become increasingly popular. Compared with text, images can store more detailed information. More importantly, because humans are particularly good at extracting information through the visual system, the large amount of information stored in images can be quickly extracted and used. Images can therefore greatly improve the efficiency with which humans use information. However, because the volume of image data is so large, storing and transmitting it is costly, which makes image coding the foundation of the entire information utilization process. This thesis focuses on image coding.

Image coding is a complex system comprising many modules, among which the transform module plays an important role. Wavelet transforms are widely used in image coding; for example, the well-known image coding standard JPEG2000 is built upon the wavelet transform. However, existing wavelet transform-based image coding methods perform much worse than other advanced image coding methods, mainly for three reasons. (1) Traditional wavelet transforms use simple hand-crafted wavelet bases, which represent natural images inefficiently and therefore limit compression performance in natural image coding. (2) Traditional wavelet transform-based image coding methods optimize each module individually, which easily leads to local minima and limits compression performance. (3) Traditional wavelet transform-based image coding methods are mostly optimized for signal fidelity rather than visual quality. These three problems lie at three progressive levels: the design of the model, the optimization of the model, and the optimization objective of the model. To address them, this thesis makes the following contributions.

(1) A learnable wavelet-like transform is proposed. To obtain a wavelet transform better suited to natural images, convolutional neural networks (CNNs) are introduced into the lifting scheme to replace its linear filters, yielding a learnable wavelet-like transform named iWave. An autoencoder-like structure is then proposed to optimize iWave for better energy compaction on natural images. To apply iWave within JPEG2000, a normalization method is proposed to scale the transform coefficients it produces. Experimental results demonstrate that iWave is more efficient than traditional wavelet transforms in terms of both energy compaction and compression performance.

(2) An end-to-end optimization method is proposed for training the wavelet-like transform-based image coding method. To this end, a CNN-based entropy coding module and a de-quantization module are designed on top of iWave, forming an end-to-end image coding method named iWave++. Different optimization strategies are designed for training iWave++ for lossy compression, for lossless compression, and for universal compression, respectively. Experimental results demonstrate that iWave++ achieves better compression performance than JPEG2000 and better generality than other end-to-end image coding methods.

(3) A visual quality-oriented optimization objective is proposed for training the wavelet-like transform-based image coding method. First, a wavelet-like transform-based probabilistic decoding framework is constructed and shown to be effective for optimizing both signal fidelity and visual quality. Based on this framework, a new visual quality-oriented optimization objective is then proposed for end-to-end training. Two training methods are proposed to optimize this objective: one based on the Kullback-Leibler divergence and the other based on the Wasserstein distance. For the latter, a rectified Wasserstein generative adversarial network is proposed to help minimize the Wasserstein distance. Experimental results indicate that training for the proposed objective leads to controllable probabilistic decoding and improves the visual quality of the reconstructed images while retaining the ability to reconstruct images with good signal fidelity.

Based on the above work, this thesis proposes a novel wavelet-like transform-based image coding method. It not only outperforms traditional wavelet transform-based image coding methods in compression performance, but is also competitive with other advanced image coding methods. This work not only reveals the great potential of the wavelet transform when applied to image coding, but also has implications for both end-to-end image coding methods and visual quality-oriented image coding methods.
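
The idea behind contribution (1), replacing the linear filters of the lifting scheme with learned nonlinear ones, rests on a property of lifting: the transform stays exactly invertible whatever the predict and update operators are. The following is only a minimal 1-D sketch of that property, not the iWave architecture. The functions `predict` and `update` are hypothetical fixed nonlinear stand-ins for trained CNNs, and the function names, periodic boundary handling, and integer rounding are assumptions made for illustration.

```python
import math


def predict(even):
    # Hypothetical nonlinear predictor standing in for a trained CNN:
    # estimates each odd sample from its two even neighbours (periodic boundary).
    n = len(even)
    return [round(64.0 * math.tanh((even[i] + even[(i + 1) % n]) / 128.0))
            for i in range(n)]


def update(detail):
    # Hypothetical update operator standing in for a second trained CNN:
    # smooths the even samples using neighbouring detail coefficients.
    n = len(detail)
    return [round(0.25 * (detail[i] + detail[i - 1])) for i in range(n)]


def forward(x):
    # One lifting step on an even-length integer signal: split, predict, update.
    even, odd = x[0::2], x[1::2]
    detail = [o - p for o, p in zip(odd, predict(even))]
    approx = [e + u for e, u in zip(even, update(detail))]
    return approx, detail


def inverse(approx, detail):
    # Undo the steps in reverse order with the same operators.
    even = [s - u for s, u in zip(approx, update(detail))]
    odd = [d + p for d, p in zip(detail, predict(even))]
    x = [0] * (len(even) + len(odd))
    x[0::2], x[1::2] = even, odd
    return x


signal = [12, 15, 200, 198, 90, 40, 41, 39]
approx, detail = forward(signal)
assert inverse(approx, detail) == signal  # exact reconstruction
```

Because the inverse recomputes `update(detail)` and `predict(even)` from values it already has, the rounding errors cancel exactly. This is why one transform of this kind can, in principle, serve both lossy and lossless coding.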
Keywords/Search Tags: deep learning, image coding, rate-distortion optimization, wavelet transform