| Focal stack images are a sequence of 2D images focused at different depths of a scene. The depth information contained in focal stack images plays a vital role in 3D display, microscopic imaging, immersive multimedia and related fields. However, while focal stack images provide a comprehensive description and presentation of a scene, their applications are bottlenecked by dense sampling, redundant representation and huge data volume. This poses great challenges to data compression, storage, transmission and application. To address these challenges, this paper proposes specialized representation models that efficiently represent single-view and multiview focal stack images, and then designs coding methods based on these representation models. Moreover, the effect of representation coding on vision applications and the advantages of the focal stack data type in vision applications are studied. The main research contents of this paper are as follows: (1) Gaussian representation based single-view focal stack image coding methods are proposed. A Gaussian representation model is proposed to represent defocused (blurred) image blocks by focused blocks. A unidirectional inter-frame prediction mechanism is constructed and integrated into the existing encoder framework. This fused coding method solves the problem that existing coding tools are not suited to focal stack images. To meet low coding complexity requirements, a low-complexity Gaussian basis coding method is proposed. This method does not need to compress the whole focal stack sequence; only a small number of selected basis blocks and the corresponding parameters are compressed. The complexity is reduced by 5% and 86% in the LDP (low delay P) configuration compared with the benchmark method and the comparison method, respectively. It also obtains an average PSNR (peak signal-to-noise ratio) gain of 2.88 dB. Finally, the effect of focal stack image representation and coding on vision applications is studied. The frame interpolation error of the Gaussian
representation based method is 68% of that of the comparison method, and the light field reconstruction quality can be improved by 0.26 dB. (2) A basis-quadtree representation based single-view focal stack image coding method is proposed. To solve the problem of efficiently coding single-view focal stack images, a basis-quadtree representation model is proposed to represent the focus changes between frames of the focal stack images. A bidirectional inter-frame prediction mechanism is constructed, which adopts a Gaussian filter and a Wiener filter to approximate the defocused and focused image blocks from basis blocks, respectively. By solving an optimization problem, the basis blocks, quadtree partitions and prediction parameters can be selected. Based on this model and mechanism, an efficient coding method is proposed to eliminate the redundancies of single-view focal stack images. The experimental results show that it obtains up to 71.59% bitrate savings and 5.23 dB PSNR gains in the LDP configuration, achieving a high compression ratio and high coding performance. (3) A multiview focal stack image coding method based on parallax focal-inconsistency representation is proposed. A dataset covering multiple types and multiple fields is built to address the lack of multiview focal stack images. To address the coding problem of multiview focal stack images, a parallax focal-inconsistency representation model is proposed to describe the special data redundancy in multiple dimensions such as the spatial, angular and focal dimensions. A multi-dimensional inter-frame prediction mechanism is constructed, so that the parallax focal-inconsistency correlations between views can be evaluated by motion estimation and focusing approximation. An efficient coding method for multiview focal stack images is designed. The method can obtain up to 39.42% bitrate savings and 1.74 dB PSNR gains, achieving twice the average performance of the comparison method. Finally, verification experiments show that multiview focal stack images can
synthesize new views with greater depth of field and reconstruct 3D surfaces with higher quality, and thus have the advantage of providing rich scene focusing information for vision applications. |
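
The idea of representing a defocused block by a focused block can be illustrated with a minimal sketch. It is not the method proposed in this work, whose exact model and parameter search are not given in the abstract. It assumes an isotropic Gaussian blur, an exhaustive search over a small hypothetical list of candidate blur widths, and PSNR as the selection criterion; all function names (`gaussian_blur`, `fit_sigma`, and so on) are illustrative.

```python
import math


def gaussian_kernel(sigma):
    """Normalized 1D Gaussian kernel with radius ceil(3 * sigma) (assumed cutoff)."""
    radius = max(1, math.ceil(3 * sigma))
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur_rows(block, kernel):
    """Filter every row with the kernel, replicating edge pixels at the borders."""
    radius = len(kernel) // 2
    width = len(block[0])
    return [
        [sum(k * row[min(max(x + j - radius, 0), width - 1)]
             for j, k in enumerate(kernel))
         for x in range(width)]
        for row in block
    ]


def transpose(block):
    return [list(column) for column in zip(*block)]


def gaussian_blur(block, sigma):
    """Separable 2D Gaussian blur: filter rows, then columns."""
    kernel = gaussian_kernel(sigma)
    horizontal = blur_rows(block, kernel)
    return transpose(blur_rows(transpose(horizontal), kernel))


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized blocks."""
    count = len(a) * len(a[0])
    mse = sum((x - y) ** 2
              for row_a, row_b in zip(a, b)
              for x, y in zip(row_a, row_b)) / count
    return float("inf") if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def fit_sigma(focused, defocused, candidates):
    """Pick the candidate sigma whose blur of `focused` best matches `defocused`."""
    best = max(candidates,
               key=lambda s: psnr(gaussian_blur(focused, s), defocused))
    return best, psnr(gaussian_blur(focused, best), defocused)


# Synthetic example: a 16x16 checkerboard stands in for a focused block,
# and its blurred copy stands in for the same block in a defocused frame.
focused = [[255.0 if (x // 4 + y // 4) % 2 == 0 else 0.0 for x in range(16)]
           for y in range(16)]
defocused = gaussian_blur(focused, 1.5)

sigma, quality = fit_sigma(focused, defocused, [0.5, 1.0, 1.5, 2.0, 2.5])
print("selected sigma:", sigma)
print("PSNR of the approximation (dB):", quality)
```

In this toy setting only the focused block and one scalar per defocused block would need to be stored, which is the intuition behind compressing a few basis blocks plus parameters rather than every frame of the stack.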