| Monocular depth estimation predicts scene depth from a single two-dimensional image and is a classic challenge in computer vision. Because it is inexpensive and flexible to deploy, it nevertheless has broad prospects for development. In recent years, with the rapid development of deep neural networks, research on monocular depth estimation based on deep learning has focused on regressing depth through encoder-decoder structures and has achieved significant results. However, the traditional decoding process usually involves only simple upsampling and convolution operations, so it cannot fully exploit well-encoded features for monocular depth estimation. This paper therefore first proposes a Laplacian pyramid monocular depth estimation model incorporating attention mechanisms to obtain high-precision depth information from images. It then combines this model with an object detection algorithm and applies it to forward vehicle distance detection as a practical application study. The research content of the paper consists of the following two parts: 1. A Laplacian pyramid monocular depth estimation algorithm incorporating attention mechanisms is proposed. The Laplacian pyramid can capture the detailed features of images at different scales, and the proposed algorithm inherits this advantage. While keeping model complexity under control, Coordinate Attention (CA) is integrated into the upsampling of the encoder feature maps at each level. In the decoding process, a Shuffle Attention (SA) module and a CA module are added to guide depth estimation using the encoded features at each scale, appropriately emphasizing the depth characteristics of those features and retaining more local information. Combining a data loss with a structural similarity loss improves the stability and convergence speed of model training while reducing training cost. 2. A forward vehicle distance detection algorithm based on depth estimation and object detection is constructed. Because forward vehicle distance detection places high real-time demands on the network, this paper extracts key frames and performs forward vehicle detection and depth estimation only on those frames. To overcome the inefficiency of general object detection methods in extracting features, this paper builds a vehicle detection model based on YOLOv5 and replaces the SPP module of the original Backbone network with an SPP-ASPP module, which expands the receptive field of the model without reducing the resolution of the feature map. In addition, an attention mechanism is introduced in the Neck to enhance the fusion of semantic and positional features. Finally, the outputs of the object detection and depth estimation networks are used as inputs to the object key-point detection algorithm. To eliminate the interference of non-vehicle pixels within the bounding box, this paper takes the local depth average within the bounding box to achieve 3D key-point fitting of the object. Experimental results on the KITTI dataset show that the Laplacian pyramid monocular depth estimation algorithm incorporating attention mechanisms not only improves the depth estimation accuracy of the model but also effectively reduces the root mean square error and the training cost. The improved YOLOv5 model improves real-time performance to a certain extent while maintaining high accuracy. The accuracy of forward vehicle distance detection using the optimized key-point detection algorithm is significantly improved. |
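
The local depth averaging step described in the second part can be illustrated with a short sketch. It is not the paper's implementation: the function name `local_mean_depth`, the `keep` fraction, the use of a centered sub-window as the "local" region, and the treatment of non-positive depths as invalid are all assumptions made for illustration. Plain Python lists are used to keep the example self-contained; an array library would be the more natural choice in practice.

```python
from statistics import fmean


def local_mean_depth(depth_map, box, keep=0.5):
    """Mean depth over a centred sub-window of a detection box.

    depth_map: list of rows, each a list of per-pixel depths (e.g. metres).
    box: (x1, y1, x2, y2) in pixels, with x2 and y2 exclusive.
    keep: fraction of the box width and height retained around the centre
          (hypothetical parameter, not taken from the paper).
    """
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    half_w = (x2 - x1) * keep / 2
    half_h = (y2 - y1) * keep / 2

    rows, cols = len(depth_map), len(depth_map[0])
    # Clamp the window to the image and keep it at least one pixel wide and tall.
    left = max(0, int(cx - half_w))
    top = max(0, int(cy - half_h))
    right = min(cols, max(left + 1, int(cx + half_w)))
    bottom = min(rows, max(top + 1, int(cy + half_h)))

    # Skip non-positive depths, assumed here to mark invalid predictions.
    values = [
        depth_map[r][c]
        for r in range(top, bottom)
        for c in range(left, right)
        if depth_map[r][c] > 0
    ]
    if not values:
        raise ValueError("no valid depth values inside the box")
    return fmean(values)


# Toy 6x6 depth map: a "vehicle" at 10 m surrounded by background at 50 m.
depth = [[50.0] * 6 for _ in range(6)]
for r in range(1, 5):
    for c in range(1, 5):
        depth[r][c] = 10.0

print(local_mean_depth(depth, (0, 0, 6, 6)))            # central window only: 10.0
print(local_mean_depth(depth, (0, 0, 6, 6), keep=1.0))  # whole box, background included
```

In the toy example the box is deliberately larger than the vehicle. Averaging over the whole box mixes in background pixels and inflates the distance estimate, whereas the central window returns the vehicle depth alone, which is the kind of interference the local average is meant to suppress.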