Depth estimation is a crucial upstream task in computer vision. Traditionally, depth is obtained with depth cameras, and the estimation problem is solved with structure-from-motion and multi-view stereo methods. In recent years, with the rapid development of deep learning, the dependence of traditional algorithms on depth cameras and the efficiency problems caused by the heavy computation of multi-view algorithms have been alleviated. However, existing deep-learning-based depth estimation methods lack solutions to the shortage of features in low-texture regions and struggle with depth ambiguity in the absence of global information. At the same time, how to combine different depth estimation algorithms so that they complement each other and offer new solutions for occlusion and glossy surfaces has become a core issue in making full use of supervision information.

To address the difficulty of extracting features in low-texture regions, this paper designs a depth estimation algorithm based on semantic segmentation and neural radiance fields. The algorithm uses multi-view stereo tools and a semantic segmentation network to fit and optimize the scene with a pre-trained lightweight network. It then applies an asymmetric sampling strategy to guide sampling of the implicit neural radiance field, and supervises training with the RGB and semantic segmentation images obtained by volume rendering. After obtaining the rendered depth map, the algorithm applies planar bilateral filtering to smooth it and produce the final depth result. Experiments show that the algorithm performs well on the experimental dataset: depth predictions in large low-texture areas and at the boundaries between foreground and background are significantly improved, providing a fresh perspective on the self-supervised depth estimation problem.

To make better use of different depth estimation algorithms and avoid the limitations of self-supervised methods in complex environments, this paper proposes a depth estimation algorithm based on iterative fusion of monocular and stereo vision. First, a monocular depth prediction network produces a depth probability distribution. Then, a multi-view optimization iteration module samples depths from this distribution, computes multi-view consistency matching weights with a stereo vision method, updates the depth probability distribution through a weight network, and iterates, progressively approaching the ground-truth depth. Finally, a context fusion module fuses depth information at different scales, making the resulting depth distribution more reasonable and smooth. The three modules complement each other: the monocular depth prediction module reduces the interference that low-texture and specular areas and occlusions cause for the multi-view algorithm; the multi-view optimization iteration module avoids the inherent ambiguity of monocular depth prediction; and the context fusion module extends depth learning beyond individual pixels, making full use of the information in the whole image. Experiments show that the algorithm performs well on different datasets and that each module contributes, making full use of the supervision information.
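To illustrate the volume-rendering step used by the first algorithm, the sketch below shows how a per-ray depth estimate can be read out of a radiance field as the expectation of sample depths under the rendering weights. It is a minimal sketch of standard volume rendering, not the paper's actual network or asymmetric sampling scheme; the sample positions and densities are assumed to come from elsewhere.

```python
import numpy as np

def render_depth_along_ray(t_samples, sigma, rgb=None):
    """Composite per-sample densities into an expected ray depth.

    t_samples : (N,) sample depths along the ray, sorted ascending
    sigma     : (N,) volume densities predicted by the radiance field
    rgb       : optional (N, 3) per-sample colors, composited the same way
    """
    # Distances between adjacent samples (last interval treated as open-ended).
    deltas = np.append(np.diff(t_samples), 1e10)
    # Opacity of each interval: alpha_i = 1 - exp(-sigma_i * delta_i)
    alpha = 1.0 - np.exp(-sigma * deltas)
    # Transmittance: probability the ray reaches sample i without terminating.
    trans = np.cumprod(np.append(1.0, 1.0 - alpha[:-1] + 1e-10))
    weights = alpha * trans                      # rendering weights, sum <= 1
    depth = np.sum(weights * t_samples)          # expected termination depth
    color = np.sum(weights[:, None] * rgb, axis=0) if rgb is not None else None
    return depth, color, weights
```

Rendering the RGB color and the depth from the same weights is what allows the photometric and semantic losses to supervise the geometry that the depth map is later read from.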
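The smoothing stage can be pictured with a generic bilateral filter over the rendered depth map, as sketched below. This is only an assumed, simplified stand-in for the planar bilateral filtering mentioned above: it weights neighbors by spatial proximity and depth similarity, so it smooths within roughly planar regions without blurring across foreground/background boundaries, but it does not model explicit plane fitting.

```python
import numpy as np

def bilateral_smooth_depth(depth, radius=3, sigma_s=2.0, sigma_r=0.1):
    """Bilateral smoothing of a depth map of shape (H, W)."""
    h, w = depth.shape
    padded = np.pad(depth, radius, mode="edge")
    out = np.zeros_like(depth)
    ys, xs = np.mgrid[-radius:radius + 1, -radius:radius + 1]
    spatial = np.exp(-(xs**2 + ys**2) / (2.0 * sigma_s**2))   # spatial kernel
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range kernel: penalize large depth differences to the center pixel.
            range_w = np.exp(-((patch - depth[i, j]) ** 2) / (2.0 * sigma_r**2))
            weights = spatial * range_w
            out[i, j] = np.sum(weights * patch) / np.sum(weights)
    return out
```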
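For the second algorithm, the iterative refinement of the depth probability distribution can be sketched as repeatedly reweighting a discrete set of depth hypotheses by multi-view consistency and renormalizing. The code below is a schematic illustration under assumed interfaces: `consistency_fn` is a hypothetical placeholder for the stereo-based matching step, and the simple multiplicative update stands in for the paper's learned weight network.

```python
import numpy as np

def iterate_depth_distribution(prob, hypotheses, consistency_fn, n_iters=4):
    """Iteratively refine a per-pixel discrete depth distribution.

    prob           : (H, W, D) initial probabilities over D depth hypotheses
                     (e.g. from a monocular network), summing to 1 per pixel
    hypotheses     : (D,) candidate depth values
    consistency_fn : callable returning (H, W, D) multi-view matching weights
                     in [0, 1] for the current hypotheses (placeholder for
                     the stereo matching module)
    """
    for _ in range(n_iters):
        match_w = consistency_fn(hypotheses)      # photometric agreement across views
        prob = prob * match_w                     # reweight the hypotheses
        prob /= prob.sum(axis=-1, keepdims=True) + 1e-8
    # Expected depth under the refined distribution.
    return np.sum(prob * hypotheses[None, None, :], axis=-1)
```

In this picture, the monocular prior keeps the distribution sensible where matching is unreliable (low texture, specularities, occlusion), while the consistency weights sharpen it toward the geometrically correct depth; a context fusion step over multiple scales would then smooth the result spatially.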