| Depth estimation has long received considerable attention in computer vision, yet it remains an open research problem. Unlike other computer vision tasks such as object detection and face recognition, which only need to process the two-dimensional information in an image, depth estimation must recover three-dimensional structure. Many methods currently exist for depth estimation. In this paper, we use the parallax between left and right views to train an end-to-end unsupervised neural network, so that the final model can produce a depth map from a single photo. We propose a new idea that combines semantic segmentation technology with a depth estimation algorithm, which effectively improves the accuracy of the model. Our model consists of an encoder and a decoder. The encoder extracts features with dilated convolution, which increases the receptive field, reduces the degree of image compression, and makes the depth map clearer and more accurate. The main work and innovative results of this paper are as follows: (1) We design a monocular model based on the matching principle of binocular depth estimation algorithms, and experiment with convolution kernels of different receptive-field sizes for feature extraction. In the fourth part of the article, the DeepLab model is used to extract features, and dilated convolution is used to increase the receptive field and thereby optimize the depth estimation. The experiments show that dilated (hole) convolution extracts the salient features of the image more effectively, thereby improving the experimental results. (2) In the fifth part, a densely connected model is combined with dilated convolutions of larger receptive field, and some oversampling is avoided. By reusing features, this method reduces the number of parameters and obtains better results. In the decoder stage, deconvolution and upsampling are used to recover the image size, and skip connections to the low-level features extracted in the encoder stage reduce the information loss of the image and make the generated depth map more complete and clear. During training, we use the unlabeled data of the KITTI and Cityscapes datasets for unsupervised end-to-end training, evaluate with the same criteria as existing algorithms, and obtain the best current results. |
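
The abstract's claim that dilated convolution enlarges the receptive field can be illustrated with a small calculation. The sketch below is not the paper's network: the helper names `effective_kernel` and `receptive_field` and the two layer stacks are assumptions chosen only for illustration. It compares three ordinary 3x3 convolutions with three 3x3 convolutions whose dilation rates are 1, 2 and 4; both stacks have the same number of weights.

```python
# Illustrative sketch only: layer stacks and helper names are hypothetical,
# not taken from the paper's architecture.

def effective_kernel(kernel_size: int, dilation: int) -> int:
    """Span (in input pixels) covered by one dilated kernel along one axis."""
    return dilation * (kernel_size - 1) + 1


def receptive_field(layers) -> int:
    """Receptive field of a stack of conv layers along one axis.

    Each layer is a (kernel_size, dilation, stride) tuple, listed from the
    input side to the output side.
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between adjacent features so far
    for kernel_size, dilation, stride in layers:
        rf += (effective_kernel(kernel_size, dilation) - 1) * jump
        jump *= stride
    return rf


# Three plain 3x3 convolutions, stride 1.
plain = [(3, 1, 1), (3, 1, 1), (3, 1, 1)]
# Same kernels and strides, but with dilation rates 1, 2 and 4.
dilated = [(3, 1, 1), (3, 2, 1), (3, 4, 1)]

print(receptive_field(plain))    # 7
print(receptive_field(dilated))  # 15
```

With identical kernel sizes and no striding, the dilated stack covers a 15-pixel span instead of 7, which is why the receptive field can grow without further downsampling of the feature map.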