
Research On 3D Environment Perception Algorithm Based On Multi-modal Sensor Fusion

Posted on: 2024-03-16    Degree: Master    Type: Thesis
Country: China    Candidate: H Qi    Full Text: PDF
GTID: 2568307115977829    Subject: Mechanical engineering
Abstract/Summary:
The safety of autonomous driving involves multiple components, among which perception of the driving environment is a critical issue for intelligent vehicles. Together with autonomous navigation, path planning, and decision-control technologies, it forms a closed-loop autonomous driving system. 3D object detection is the basic task of environmental perception: it identifies objects of interest in sensor data and determines their position and category. To achieve stronger perception and more reliable object detection, autonomous vehicles are equipped with both camera and LiDAR sensors, and multimodal sensor fusion is used to perceive 3D objects more comprehensively, thereby meeting the safety and accuracy requirements of the autonomous driving system.

Building on this, this thesis explores information fusion methods based on LiDAR and camera, focusing on multimodal spatial fusion and high-quality multimodal data augmentation. To overcome geometric distortion, information loss, and feature confusion caused by heterogeneous multimodal sensors, a two-stage sequence fusion framework is constructed: a nearest-group association method is designed, and non-maximum suppression is completed jointly using cascaded confidence and distance. Based on the contextual information of images and point clouds, multimodal sample pasting is implemented to enrich the autonomous driving training dataset. This provides new ideas for the large-scale, industrial application of 3D environmental perception technology in high-precision, high-efficiency fields such as autonomous driving and mobile robots. The main research contents are as follows:

(1) Study on the Intelligent Vehicle Perception System and Multi-Sensor Calibration Method

We built an intelligent vehicle perception system with a 3D LiDAR and a monocular camera as sensing devices. We
analyzed the pinhole and nonlinear camera imaging models, established the conversion relationships among the vehicle, camera, LiDAR, and image coordinate systems, and carried out independent and joint calibration experiments for the heterogeneous sensors. Using tools such as Python, OpenCV, and Autoware, we solved for the camera's intrinsic matrix and the camera-LiDAR extrinsic parameters, unified the sensors' coordinate systems, and constructed a mapping model, providing guidance for the subsequent spatial fusion algorithms.

(2) Study on a 3D Object Detection Method Based on Two-Stage Sequence Fusion

3D object detection combining LiDAR and camera fusion has made significant progress in the field of autonomous driving; however, the challenge lies in designing effective positions and strategies for multimodal fusion. This thesis therefore proposes a 3D object detection method based on two-stage sequence fusion. In the first fusion stage, we fuse the raw point cloud with image instance segmentation masks to generate a semantically rich reinforced point cloud, and introduce the idea of nearest-group association to reduce the impact of noisy points on network training. In the second fusion stage, non-maximum suppression is performed by cascading anchor-point spacing and confidence to obtain more accurate candidate regions. Extensive experiments on the KITTI dataset show that the proposed algorithm achieves superior average detection accuracy, and ablation experiments validate the effectiveness of each module.

(3) Research on Multimodal Data Augmentation Based on Contextual Information

Data augmentation has been widely applied in 3D point cloud and 2D image object detection networks. However, existing multimodal data augmentation methods simply provide a basic
reference for single-modal approaches; the challenge lies in ensuring the consistency and rationality of the image and point cloud samples when pasting. This thesis proposes a multimodal data augmentation method based on contextual information to generate content-rich training scenes for autonomous driving. First, we construct a realistic sample database and a scene-ground database from the original training set; we then use image and point cloud contextual information to guide the placement of real samples during pasting. We evaluated the effectiveness of this augmentation method for 3D multimodal detectors on the KITTI dataset, where it outperforms existing data augmentation methods. Ablation experiments further demonstrate that contextual information provides more valuable features for network training.
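The coordinate unification described in (1) can be illustrated with a pinhole projection from the LiDAR frame into the image. The sketch below is a minimal example under assumed calibration values: the intrinsic matrix K and the extrinsics R, t are hypothetical placeholders, not the thesis's calibrated parameters.

```python
import numpy as np

# Hypothetical calibration results (placeholders, not the thesis's values):
# K    - camera intrinsic matrix (focal lengths fx, fy; principal point cx, cy)
# R, t - camera-LiDAR extrinsic rotation and translation
K = np.array([[700.0,   0.0, 640.0],
              [  0.0, 700.0, 360.0],
              [  0.0,   0.0,   1.0]])
R = np.eye(3)
t = np.array([0.1, -0.05, 0.2])

def lidar_to_pixel(points_lidar):
    """Project Nx3 LiDAR points to pixel coordinates via the pinhole model."""
    pts_cam = points_lidar @ R.T + t        # LiDAR frame -> camera frame
    proj = pts_cam @ K.T                    # apply intrinsics
    uv = proj[:, :2] / proj[:, 2:3]         # perspective division by depth
    return uv, pts_cam[:, 2]                # pixel coordinates and depths

uv, depth = lidar_to_pixel(np.array([[0.0, 0.0, 5.0]]))  # one point 5 m ahead
```

In practice K comes from camera calibration (e.g. OpenCV's chessboard calibration) and R, t from joint camera-LiDAR calibration, as done in the thesis with OpenCV and Autoware.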
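The two fusion stages in (2) can be sketched roughly as follows: stage one "paints" each point with its image segmentation label, and stage two runs a greedy non-maximum suppression that cascades confidence order with center distance. This is an illustrative simplification under assumed data shapes, not the thesis's actual network code; function names and thresholds are hypothetical.

```python
import numpy as np

def paint_points(points, uv, mask):
    """Stage 1 (sketch): append each projected point's instance-segmentation
    label to the raw cloud, yielding a semantically 'reinforced' point cloud."""
    h, w = mask.shape
    u = np.clip(uv[:, 0].astype(int), 0, w - 1)
    v = np.clip(uv[:, 1].astype(int), 0, h - 1)
    labels = mask[v, u].astype(np.float32)
    return np.hstack([points, labels[:, None]])   # (N,3) -> (N,4)

def distance_confidence_nms(centers, scores, dist_thresh=1.0):
    """Stage 2 (sketch): greedy NMS cascading confidence with center distance --
    keep the highest-scoring box, suppress neighbors within dist_thresh."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        d = np.linalg.norm(centers[order[1:]] - centers[i], axis=1)
        order = order[1:][d > dist_thresh]
    return keep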
Keywords/Search Tags:Autonomous driving, Environmental perception, Multimodal sensor fusion, 3D object detection, Multimodal data augmentation
PDF Full Text Request
Related items