Research On Multi-Sensor 3D Object Detection Algorithms In Complex Road Scenarios

Posted on: 2024-05-03
Degree: Master
Type: Thesis
Country: China
Candidate: R Z Song
GTID: 2542307064495074
Subject: Engineering
Abstract/Summary:
The perception system is an important part of an intelligent vehicle. Comprehensive and accurate perception of the vehicle's surroundings is a prerequisite for driving safety. 3D object detection is a crucial perception task: it obtains the size, location, orientation and category of each object and provides data for planning, decision-making and control. As automated driving develops, perception scenarios become increasingly complex. To improve the richness and accuracy of perception, intelligent vehicles are equipped with a variety of sensors, among which LiDAR and cameras are widely used in 3D object detection because their data are complementary. However, current object detection algorithms do not achieve the accuracy and reliability required for automated driving in complex scenarios, and both their degree of data fusion and their detection accuracy still need to be improved. This thesis focuses on feature extraction, feature fusion and object detection. The main research contents are as follows.

1. Spatiotemporal calibration of a monocular camera and LiDAR. To address coordinate system conversion and data synchronization between heterogeneous sensors, we analyze the principle and model of joint camera-LiDAR calibration. We solve the intrinsic and extrinsic parameters of the LiDAR and camera to align the point cloud coordinate system with the image coordinate system, and we unify the clock source and synchronize the frequency of the camera and LiDAR so that images and point clouds are output in sync.

2. Research on a fusion algorithm based on cross-attention. Existing methods fuse point cloud and image features only coarsely. To address this, the thesis proposes a fusion algorithm based on the cross-attention mechanism. The method uses cross-attention to dynamically capture the correlation between image features and point cloud features, and aggregates the image features according to the resulting weight matrix. It avoids directly concatenating or summing data from different modalities, which yields high-quality fusion of images and point clouds and effectively reduces the impact of modality differences on the model. In experiments on the KITTI dataset with the same detection model, the algorithm increases multi-class average precision by 0.83% compared with a feature-concatenation fusion strategy, effectively improving object detection performance.

3. Research on a dynamic sparse convolution strategy. A dynamic sparse convolution algorithm is proposed to address the heavy computation and information loss of existing point cloud feature extraction modules. The output shape of the dynamic convolution is variable: during convolution, it predicts and extracts the relatively important foreground features. This effectively alleviates the loss or dilution of important features in sparse convolution. In comparison experiments with existing sparse convolution methods on the KITTI dataset, the method improves pedestrian detection precision by 2.71% at the medium difficulty level, effectively improving the detection precision for small objects.

4. Research on a two-stage 3D object detection algorithm based on the fusion of voxel features and image features. Existing methods detect and localize small and distant targets poorly in complex road scenes. To address this, a two-stage detection network is proposed. A multi-layer receptive field module is added to the image branch for multi-scale aggregation of local image features. A voxel pooling module is added at the end of the model to sample and aggregate the uncompressed 3D voxel features. By integrating the feature extraction and fusion methods described above, a two-stage 3D object detection algorithm based on the fusion of voxel features and image features is designed. The algorithm was evaluated on a publicly available dataset and tested in real-world driving scenarios. Experiments on the KITTI dataset show that the proposed 3D object detection algorithm improves significantly in accuracy compared with mainstream algorithms.
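
The cross-attention fusion in item 2 can be illustrated with a minimal sketch. This is not the thesis implementation: it assumes point cloud features act as queries and image features act as both keys and values, it uses scaled dot-product attention, and it omits the learned query/key/value projections a real network would have. The names `cross_attention`, `point_feats` and `image_feats` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    peak = max(row)
    exps = [math.exp(v - peak) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(point_feats, image_feats):
    """Aggregate image features for each point feature.

    point_feats: N query vectors of length d (hypothetical point cloud features).
    image_feats: M key/value vectors of length d (hypothetical image features).
    Returns (weights, aggregated): an N x M weight matrix and N vectors of length d.
    Assumption: learned projections are omitted, so queries, keys and values
    are the raw feature vectors.
    """
    d = len(point_feats[0])
    scale = 1.0 / math.sqrt(d)
    weights = []
    aggregated = []
    for q in point_feats:
        # Correlation between this point feature and every image feature.
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in image_feats]
        w = softmax(scores)
        weights.append(w)
        # Weighted sum of image features, one output channel at a time.
        aggregated.append(
            [sum(w_j * v[c] for w_j, v in zip(w, image_feats)) for c in range(d)]
        )
    return weights, aggregated


# Toy data: two point features and three image features with d = 2.
points = [[1.0, 0.0], [0.0, 1.0]]
pixels = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
weights, fused = cross_attention(points, pixels)
```

Each row of `weights` sums to 1, so every point receives a convex combination of image features rather than a fixed concatenation or sum. In the toy data, the first point attends mostly to the first image feature because their dot product is largest.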
Keywords/Search Tags: Automated Driving, Fusion Perception, 3D Object Detection, Cross-attention, Sparse Convolution