Autonomous driving perception systems based on monocular cameras can detect targets accurately, but the lack of scene depth information makes it difficult to determine a target's position and scale. Image-based convolutional neural networks can significantly improve depth estimation accuracy, but a single image alone provides no reliable depth reference, so the ill-posedness of monocular depth estimation remains. Existing studies often use LiDAR point clouds to provide depth reference points for the scene, but LiDAR is weather-sensitive and computationally demanding, making it difficult to meet real-time requirements. Since radar is an all-weather sensor with low hardware cost, this thesis studies dense depth estimation and object detection methods based on radar and vision, in order to improve the target recognition accuracy and target position estimation precision of autonomous driving perception systems. The main research includes:

(1) Dense depth estimation based on vision and radar. Compared with LiDAR, radar point clouds are extremely sparse, providing prior depth values for only about 0.003% of the pixels in each image frame, i.e., a few dozen pixels at typical camera resolutions. To achieve sufficient depth estimation accuracy under such sparse point cloud conditions, a two-stage depth estimation algorithm is proposed. A sparse pre-mapping module in the depth estimation branch first extracts sparse features; a feature fusion module then introduces a channel attention mechanism in which the first-stage decoder features, which carry higher-level semantics, guide the second-stage feature encoding (a minimal sketch of this fusion is given below). Semantic information is also incorporated into the loss function to account for driving scene characteristics, yielding a more accurate depth map.

(2) Fusion object detection based on radar and vision. Starting from the idea that radar information complements what is missing from images, this thesis improves object detection accuracy by enhancing image features with radar features. Two transformer architectures are proposed for fusing radar and image features (one common fusion pattern is sketched below), and the radar representation, fusion stage, and fusion method are investigated. A decoder structure that aggregates multi-scale features is further proposed to address the false and missed detections that single-scale features can cause.

(3) Target pose estimation based on dense depth estimation and vision. Building on the two preceding contributions, the dense depth map is used to obtain each target's position (see the back-projection sketch below). Based on the velocity information of the radar point cloud, a velocity extraction module is constructed to help the network regress more accurate velocity information, supporting the autonomous driving decision algorithm.

On the depth estimation task, the proposed algorithm estimates the depth of all pixels within 100 m and reduces the average error by 0.518 m compared with the best existing algorithm; on the object detection task, it improves the AP value by 3.7% over the best existing algorithm. On this basis, the target pose estimation task is completed with improved accuracy, building a reliable perception module for autonomous driving systems using millimetre-wave radar and cameras.
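The following is a minimal sketch of the channel-attention fusion described in (1), assuming a PyTorch implementation with squeeze-and-excitation style gating; the module, parameter, and tensor names are illustrative assumptions, not the thesis's code.

```python
import torch
import torch.nn as nn

class ChannelAttentionFusion(nn.Module):
    """Gate stage-2 encoder features with weights derived from stage-1 decoder features."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.gate = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),  # squeeze: one global context value per channel
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),  # per-channel weights in (0, 1)
        )

    def forward(self, stage2_feat: torch.Tensor, stage1_dec_feat: torch.Tensor) -> torch.Tensor:
        # Channel weights come from the semantically richer first-stage decoder features
        weights = self.gate(stage1_dec_feat)
        # Re-weight the second-stage encoder features channel by channel
        return stage2_feat * weights

# Usage (hypothetical shapes): both inputs are (B, 256, H, W)
# fuse = ChannelAttentionFusion(256); out = fuse(stage2_feat, stage1_dec_feat)
```

The design mirrors standard channel attention: global context from the semantically richer first-stage features decides which second-stage channels to emphasize during encoding.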
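One transformer fusion pattern consistent with (2) is cross-attention in which image feature tokens query radar point tokens, so radar cues enhance the image features only where they are relevant. The sketch below assumes PyTorch's nn.MultiheadAttention and illustrative shapes; it does not reproduce the thesis's two specific architectures.

```python
import torch
import torch.nn as nn

class RadarImageCrossAttention(nn.Module):
    def __init__(self, dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, img_tokens: torch.Tensor, radar_tokens: torch.Tensor) -> torch.Tensor:
        # img_tokens:   (B, H*W, dim)  flattened image feature map
        # radar_tokens: (B, N,   dim)  embedded radar points (e.g. position, RCS, velocity)
        fused, _ = self.attn(query=img_tokens, key=radar_tokens, value=radar_tokens)
        # Residual connection keeps the image features intact where radar adds nothing
        return self.norm(img_tokens + fused)
```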
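For (3), a standard way to turn a dense depth map into a target position is pinhole back-projection of the 2D detection centre; the helper below is a hedged sketch under that assumption, with assumed camera intrinsics (fx, fy, cx, cy), not the thesis's pose estimation module.

```python
import numpy as np

def backproject(u: float, v: float, depth_map: np.ndarray,
                fx: float, fy: float, cx: float, cy: float) -> np.ndarray:
    """Return the (X, Y, Z) camera-frame position of pixel (u, v)."""
    z = depth_map[int(v), int(u)]  # metric depth at the detection centre
    x = (u - cx) * z / fx          # pinhole model: u = fx * X / Z + cx
    y = (v - cy) * z / fy          # pinhole model: v = fy * Y / Z + cy
    return np.array([x, y, z])
```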