In the real world, humans directly perceive and understand scenes through three-dimensional (3D) information, and images offer another way of perceiving the world. Unlike a single image, which records only two-dimensional information, a light field image contains 3D information about the scene. This information can be used to reconstruct the 3D scene, and presenting that scene on a 3D display device provides an immersive viewing experience. However, limited by expensive acquisition equipment and complex application scenarios, current light field images usually cannot meet the spatial or angular resolution requirements of 3D display devices. Light field images therefore need to be post-processed by super-resolution (SR) reconstruction algorithms to provide high-quality content for 3D displays. Although research on image SR reconstruction has achieved remarkable results in recent years with the advancement of deep learning, existing methods still suffer from poor transferability, strong label dependence, and low computational efficiency. These shortcomings limit the development of 3D displays, especially content generation for dynamic 3D displays. To address the challenging problems of SR reconstruction for 3D display, this dissertation conducts in-depth research on 3D light field images and proposes a series of efficient 3D display content generation methods from both image-based and model-based perspectives, providing solutions for dynamic 3D display. The research content and main innovations are summarized as follows:

(1) Light field image spatial super-resolution reconstruction method based on raw data with transformers

To address the poor reconstruction quality of existing light field spatial SR algorithms on real-scene data, a spatial SR reconstruction method based on raw images and an attention mechanism is proposed. Unlike existing methods that take low-resolution color images as input, this method uses raw images without image signal processing as input, which improves information fidelity and provides more effective information for spatial SR reconstruction. The method fuses and restores the raw data with an end-to-end deep learning network consisting of two modules. The first is an aggregation module. The light field image is first shifted and concatenated to construct a plane sweep volume (PSV) representing refocusing results at different depths; this depth layering helps express the structural information of the scene. A volume attention mechanism is then introduced to aggregate the PSV across depths and regress the central view image from the depth probability distribution, reconstructing an aggregated central view. The second is a refinement module, which uses the central view reconstructed by the first module as a reference to reconstruct the edge views. To align the spatial positions of the central and edge views, this module uses a cross-view attention mechanism to match their non-local information, and fuses the central view features extracted by the attention mechanism into the edge views to achieve reconstruction. Experimental results show that the proposed method improves restoration quality on real-scene images, especially for edge details and scene structure. Evaluated on several public light field datasets and our own collected light field datasets, the method shows good generalization across scenarios, and improves the peak signal-to-noise ratio (PSNR) by 1.5 dB over previous methods. In tests on a 3D
light field display, combined with the eye tracking mode of the display device, the aggregation module of this method can perform real-time SR reconstruction of the specific view the viewer is watching, which improves the visual effect of dynamic 3D display.

(2) Real-time dense-view image synthesis method based on light field image color calibration and self-supervised disparity estimation

To address the problem that the number of views in real-scene light field datasets rarely matches that required by 3D display devices, an angular SR reconstruction method based on color correction and disparity map estimation is proposed. Owing to the physical size and internal structure of the camera array, real-scene light field images usually suffer from color differences and large baselines, and therefore cannot be applied directly to 3D light field display. To address these issues, the method first models the imaging process and uses a 3×3 color correction matrix to represent the color difference between cameras, introducing a fully connected network to predict the matrix used for color correction of the light field image. In addition, an auto-encoder network model is proposed to estimate the disparity between adjacent views. The model consists of a single-image feature encoder and four decoders at different resolutions, which output coarse-to-fine disparity maps. Since the camera array captures only color images, the dataset provides no disparity labels to supervise the estimated disparity maps. The method therefore adopts self-supervised learning: the estimated disparity map is applied to the right-view color image through image warping to obtain an estimated left-view image, which is supervised by the input left view, thereby optimizing the disparity map. Finally, the estimated disparity map can be used
to synthesize novel views between the left and right views, improving the angular resolution of the light field images. Experimental results show that this angular SR reconstruction method based on color correction and disparity map estimation can reconstruct high-resolution real-scene light field images for 3D display. In the color correction evaluation, our color correction matrix yields a smaller image color error than traditional methods, with an average error of 0.0143; moreover, the color-corrected images improve the quality of disparity estimation. In the angular resolution evaluation, the proposed self-supervised learning scheme generates accurate disparity maps, and the PSNR of the synthesized novel views exceeds 30 dB, significantly higher than other self-supervised methods. In terms of computational performance, the method achieves 6× angular SR reconstruction at around 25 FPS at 1024×512 resolution, and the reconstructed images are displayed dynamically on a 3D light field display device at 7680×4320 resolution.

(3) Real-time light-field encoded image generation method based on path tracing and CNN super-resolution

The multiview images generated by image-based methods must be further synthesized into light field encoded images before they can be used for 3D display. Because this process does not require all the information in these images, the multiview images used for 3D display contain redundancy. To improve the efficiency of light field encoded image generation, a real-time generation method based on path tracing and angular SR reconstruction is proposed. Although model-based rendering generates light field encoded images more efficiently than image-based methods, the speed of existing rendering methods still cannot meet the requirements of high-resolution (HR) dynamic 3D display. A two-stage scheme is therefore used to combine path tracing rendering with an angular SR algorithm: path tracing provides low-resolution light field encoded images, and the angular SR algorithm improves their resolution. In the first stage, the path tracing algorithm renders the light field encoded image from sparse viewpoints to reduce rendering time. In the second stage, a lightweight angular SR model based on a generative adversarial network is proposed. During training, the generator module performs angular SR reconstruction while the discriminator module provides additional supervision on the reconstruction results to improve the realism of the reconstructed images; at test time, only the generator is used, reducing computation time. The method can also optionally reconstruct only the foreground region of the image to further reduce computation. Experimental results show that this method not only renders high-resolution images in real time but also outputs high-quality light field encoded images. In the rendering pipeline design, the angular SR module is embedded into the path tracing pipeline so that the whole rendering process runs on the graphics processing unit (GPU), avoiding extra data communication, and light field encoded images at 7680×4320 resolution can be generated at over 30 FPS. In the angular SR evaluation, the proposed method achieves the best restoration of scene structure among comparable SR methods at similar computational cost, and the structural similarity of the reconstructed images remains above 0.90 relative to HR images rendered by path tracing.
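The shift-and-concatenate construction of the plane sweep volume in part (1) can be sketched as follows. This is a minimal NumPy illustration with integer pixel shifts, assuming a regular grid of grayscale sub-aperture views; a real pipeline would use sub-pixel (bilinear) warping, and all names here are illustrative rather than taken from the dissertation's code:

```python
import numpy as np

def build_psv(views, disparities):
    """Shift-and-stack sub-aperture views into a plane sweep volume (PSV).

    views: array of shape (U, V, H, W) -- the grid of sub-aperture images.
    disparities: iterable of integer disparity hypotheses (pixel shift per
    unit of angular distance from the central view).
    Returns a PSV of shape (D, U, V, H, W); slice d holds every view
    re-centred as if the whole scene lay on depth plane d.
    """
    U, V, H, W = views.shape
    u0, v0 = U // 2, V // 2  # index of the central view
    psv = np.empty((len(disparities), U, V, H, W), dtype=views.dtype)
    for di, d in enumerate(disparities):
        for u in range(U):
            for v in range(V):
                # integer shift toward the central view; a real pipeline
                # would use sub-pixel interpolation instead of np.roll
                psv[di, u, v] = np.roll(
                    views[u, v],
                    shift=(d * (u - u0), d * (v - v0)),
                    axis=(0, 1))
    return psv
```

At the correct depth plane, the shifted views align on in-focus scene points, which is what lets the volume attention mechanism weight depth slices by a depth probability distribution when regressing the central view.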
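The 3×3 color correction described in part (2) amounts to a per-pixel linear map of RGB values. A minimal sketch, assuming a float RGB image and a given correction matrix (in the dissertation the matrix is predicted per camera by a small fully connected network; how it is obtained is omitted here):

```python
import numpy as np

def apply_ccm(img, ccm):
    """Apply a 3x3 color correction matrix to an RGB image.

    img: float array of shape (H, W, 3).
    ccm: (3, 3) matrix mapping each source pixel's RGB to the reference
    camera's color space, i.e. corrected_rgb = ccm @ rgb.
    """
    h, w, _ = img.shape
    corrected = img.reshape(-1, 3) @ ccm.T  # one matrix multiply per pixel
    return corrected.reshape(h, w, 3)
```

Because the correction is a single matrix per camera, it is cheap enough to apply to every view of the light field before disparity estimation.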
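The self-supervised disparity supervision in part (2) can be sketched for the rectified horizontal-stereo case: the right view is warped by the estimated disparity to reconstruct the left view, and the photometric error against the true left view supervises the disparity without any labels. This is a simplified grayscale NumPy version with clamped borders; the dissertation's network operates on color images at multiple decoder resolutions:

```python
import numpy as np

def warp_right_to_left(right, disp):
    """Reconstruct the left view by sampling the right view at x - d(x).

    right: (H, W) image; disp: (H, W) disparity map in the left-view frame.
    Uses linear interpolation along rows; border samples are clamped.
    """
    H, W = right.shape
    xs = np.arange(W)[None, :] - disp                 # sampling positions
    x0 = np.clip(np.floor(xs).astype(int), 0, W - 2)  # left neighbour index
    frac = np.clip(xs - x0, 0.0, 1.0)                 # interpolation weight
    rows = np.arange(H)[:, None]
    return (1 - frac) * right[rows, x0] + frac * right[rows, x0 + 1]

def photometric_loss(left, right, disp):
    """L1 photometric error between the true and warped left views."""
    return float(np.mean(np.abs(left - warp_right_to_left(right, disp))))
```

Minimizing this loss over the network's predicted disparity maps is what lets the model train on the camera-array dataset, which contains color images only; the same warp is then reused at inference to synthesize novel views between the left and right cameras.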
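The synthesis of a light field encoded image from multiview images, mentioned in part (3), is typically a subpixel interleaving step for a slanted-lenticular display. The sketch below uses a generic slanted-lens mapping; the actual formula depends on the display's optical parameters (lens pitch, slant, offset), which the dissertation does not specify, so all parameter names are assumptions:

```python
import numpy as np

def interleave_views(views, line_number, slant, offset=0.0):
    """Interleave N view images into one light-field encoded image.

    views: (N, H, W, 3) array of rendered viewpoints.
    line_number: number of subpixels covered by one lens pitch.
    slant: lens slant expressed in subpixels per row.
    Each subpixel (y, x, c) is copied from the view selected by its
    fractional position under the lens above it.
    """
    N, H, W, C = views.shape
    out = np.empty((H, W, C), dtype=views.dtype)
    y = np.arange(H)[:, None]
    for c in range(C):
        x = np.arange(W)[None, :] * C + c              # subpixel column index
        phase = ((x + offset - y * slant) % line_number) / line_number
        idx = (phase * N).astype(int) % N              # view index per subpixel
        out[..., c] = np.take_along_axis(views[:, :, :, c], idx[None], axis=0)[0]
    return out
```

Because each subpixel of the encoded image draws on only one of the N views, most multiview pixels are discarded here, which is the redundancy the two-stage path-tracing plus angular-SR pipeline exploits by rendering sparse views at low resolution first.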