| Depth estimation provides 3D scene information for applications such as face recognition, 3D reconstruction, and autonomous driving. Compared with traditional imaging systems, light field imaging records 4D information about both the intensity and the direction of light, and therefore carries multiple depth cues. In recent years, deep learning has driven significant progress in light field depth estimation. However, most methods extract depth cues from a single representation of the light field, and the depth information contained in a single cue is incomplete. For example, epipolar plane image (EPI) cues handle occlusion and textureless regions effectively but are ineffective against noise, whereas focus cues handle noise well but are limited in occluded regions. This paper studies the role of the various light field cues in depth estimation and uses deep learning to exploit the complementary characteristics of multiple cues, proposing "FusionNet", a light field depth estimation model based on multi-cue fusion learning. The model comprises three sub-networks: an EPIs Pathway based on EPIs, a Defocus Pathway based on focus cues, and a Structure Pathway based on central sub-aperture cues. The focus of this paper is the complementary characteristics of multiple cues in depth estimation. The main contributions are as follows: 1. Building on a depth estimation method that uses multi-directional EPI patches, this paper proposes a focus sub-network based on refocusing cues to address the noise sensitivity of EPI depth cues. Introducing refocusing cues improves the poor noise robustness of the EPI sub-network. 2. Because the focus sub-network based on refocusing cues predicts poorly in occluded and edge regions, this paper introduces the image structure features of the central sub-aperture view and designs an image structure sub-network based on central sub-aperture cues. This sub-network improves prediction at edges. 3. After analyzing the three single-cue sub-networks, this paper proposes a multi-cue fusion strategy that exploits their respective strengths. The fusion is performed at multiple layers so that the three sub-networks are learned jointly, allowing the fused model, FusionNet, to handle occlusion, edges, and noise satisfactorily. FusionNet achieves a significant improvement in the accuracy of light field depth estimation. The algorithm has been submitted to the HCI 4D light field depth estimation benchmark website, where FusionNet ranked first on the high-precision metrics BadPix 0.03 and BadPix 0.01, demonstrating the superiority of the proposed algorithm. |
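
The BadPix metrics cited in the abstract measure the percentage of pixels whose absolute disparity error exceeds a threshold (0.03 or 0.01 here). The following is a minimal sketch of that computation, assuming disparity maps stored as nested lists of equal shape and no evaluation mask. The function name `bad_pix` and the toy data are hypothetical, and this is not the benchmark's official evaluation code.

```python
def bad_pix(pred, gt, threshold):
    """Return the percentage of pixels with |pred - gt| > threshold.

    pred, gt: disparity maps as lists of rows (assumed to have equal shape).
    threshold: error tolerance in disparity units, e.g. 0.03 or 0.01.
    """
    if len(pred) != len(gt) or any(len(p) != len(g) for p, g in zip(pred, gt)):
        raise ValueError("pred and gt must have the same shape")
    total = sum(len(row) for row in gt)
    if total == 0:
        raise ValueError("disparity maps must not be empty")
    # Count pixels whose absolute error is strictly above the threshold.
    bad = sum(
        1
        for p_row, g_row in zip(pred, gt)
        for p, g in zip(p_row, g_row)
        if abs(p - g) > threshold
    )
    return 100.0 * bad / total


# Toy 2x2 example (made-up values, for illustration only).
gt = [[0.0, 0.5], [1.0, -1.0]]
pred = [[0.005, 0.52], [1.05, -1.0]]
print(bad_pix(pred, gt, 0.03))
print(bad_pix(pred, gt, 0.01))
```

A tighter threshold counts more pixels as bad, which is why BadPix 0.03 and BadPix 0.01 are regarded as high-precision measures: they reward methods whose errors are small almost everywhere, not only on average.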