Object detection based on multi-sensor fusion helps intelligent vehicles understand their surroundings in all weather and working conditions. Through redundant and complementary environmental information from multiple sensors, object detection provides intelligent vehicles with the motion states and geometric properties of specific targets in traffic. Because the raw data of different sensors differ in structure and noise characteristics, it is difficult to make full use of them to achieve stable and robust object detection in complex traffic environments. In the calibration of vehicle-road collaborative sensing systems, there is relatively little research on the overall joint calibration of vehicle-mounted multi-sensor suites, and the complex geometric constraints used in existing methods easily lead to sub-optimal calibration results. Moreover, the large differences in the sensor configuration of roadside systems make them poorly compatible with joint calibration methods designed for vehicle-mounted multi-sensor systems. In object detection based on vehicle-mounted multi-sensor fusion, pixel-level registration between low-cost RGB cameras and far-infrared cameras is difficult due to the lack of depth information. In addition, there is no adequate theory for the unified description of the traffic environment and for decoupled data fusion in multi-sensor perception systems. In response to these problems, and building on intelligent-driving projects, this paper optimizes and innovates the object detection system based on multi-sensor fusion in low-speed scenarios. The main contributions are as follows.

Firstly, to advance research on multi-sensor joint calibration, this paper constructs geometric correlation information between heterogeneous sensors, converts it into a nonlinear optimization loss function, and uses extrinsic calibration algorithms to complete the joint calibration of vehicle-mounted multi-sensor systems and roadside perception systems. In actual projects, based on the perception architecture of an intelligent sweeper, a circular calibration board with a chessboard pattern on its surface is designed, and the board's circle center is used as a geometric constraint to jointly calibrate dual LiDARs and a monocular camera. Moreover, LiDAR-assisted panoramic image stitching of dual fish-eye cameras establishes common geometric constraints among the three sensors, simultaneously completing joint calibration, panoramic image stitching, and spatial alignment between the panoramic images and the LiDAR point clouds. In addition, 2D Radar data is converted into a nonlinear optimization and reprojection-error loss function to complete the joint calibration of roadside Radar, roadside camera, and roadside LiDAR.

Secondly, this paper develops a fusion architecture combining a low-cost RGB camera and an infrared camera for 3D pedestrian detection in daytime and low-light conditions, in order to reduce safety accidents during cooperation between agricultural machinery and workers. Based on the FieldSAFE dataset, a semi-automatic labeling scheme is designed to generate 3D cylindrical labels for pedestrians, and a 3D pedestrian detection dataset, FieldSafePedestrian, is proposed for the farm environment. Furthermore, this paper proposes a depth-guided registration method to dynamically align RGB and infrared images. Through feature shifting and feature pooling, the image features of the heterogeneous sensors are further fused to complete robust 3D pedestrian detection.
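For illustration, the sketch below shows one generic form of depth-guided registration consistent with the description above: each RGB pixel is back-projected to 3D using a depth map, transformed into the infrared camera frame, and re-projected to find its corresponding infrared pixel. The intrinsics, extrinsics, and depth values are hypothetical placeholders, not parameters from the thesis.

```python
# A minimal sketch of depth-guided RGB/infrared registration, not the thesis
# implementation. K_rgb and K_ir are hypothetical camera intrinsics; (R, t)
# are assumed RGB-to-infrared extrinsics; depth is a per-pixel depth map
# aligned with the RGB image.
import numpy as np

def depth_guided_warp(depth, K_rgb, K_ir, R, t):
    """Return, for each RGB pixel, its (u, v) location in the infrared image."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    # Back-project RGB pixels to 3D camera coordinates using the depth map.
    pts = (pix @ np.linalg.inv(K_rgb).T) * depth.reshape(-1, 1)
    # Transform into the infrared camera frame and apply pinhole projection.
    pts_ir = pts @ R.T + t
    proj = pts_ir @ K_ir.T
    return (proj[:, :2] / proj[:, 2:3]).reshape(h, w, 2)

# Hypothetical example: identical intrinsics, a small horizontal baseline,
# and a flat scene 5 m away; the warp then reduces to a constant disparity.
K = np.array([[500.0, 0.0, 320.0],
              [0.0, 500.0, 240.0],
              [0.0, 0.0, 1.0]])
coords_ir = depth_guided_warp(np.full((480, 640), 5.0), K, K,
                              np.eye(3), np.array([0.1, 0.0, 0.0]))
```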
Thirdly, this paper proposes a general framework of decoupled data fusion for multi-sensor perception systems, which allows the input data of different modalities to be decoupled from one another within the fusion framework. In the fusion experiments, this paper first verifies the feasibility of 3D object detection based on single-modal Radar, and then uses a spatial attention module to achieve decoupled data fusion of LiDAR and Radar. When one sensor fails, the data of the other modality can receive 100% of the fusion weight by virtue of the framework structure, so that the object detection task is still completed and the decoupling between multi-modal data is truly realized. We also compare our method with other fusion methods and with random sensor-failure training methods, which demonstrates the advantages of our approach.

Finally, this paper further explores the conversion of image features into 3D space. After converting image features to the bird's-eye view (BEV), image semantic information can be further fused with LiDAR and Radar to form decoupled fusion of camera and LiDAR and to complete panoramic data fusion in the object detection system. In the fusion process, the experiments first verify the feasibility of the decoupled data fusion theory for camera and LiDAR data, and then realize panoramic object detection based on multi-view cameras, LiDAR, and Radar. The experimental results show that the decoupled data fusion of camera and LiDAR achieves more accurate detection results with dual-modal data input, while under sensor failure the single-modal input still maintains adequate performance. Moreover, after fusing the feature information of panoramic images, the fusion network achieves more accurate object detection, thus forming a panoramic multi-sensor fusion object detection system.
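As a concrete illustration of the decoupled-fusion principle used in the third and final contributions, the following sketch shows one way a per-location attention weighting over two BEV feature maps could collapse to a single modality when the other sensor fails. The module structure, layer names, and shapes are assumptions for illustration, not the thesis network.

```python
# A minimal sketch of decoupled fusion with per-cell spatial attention, not
# the thesis implementation. Masking a failed sensor's attention logit to the
# dtype minimum drives its softmax weight to zero, so the surviving modality
# receives 100% of the fusion weight, as described in the text.
import torch
import torch.nn as nn

class DecoupledFusion(nn.Module):
    def __init__(self, channels):
        super().__init__()
        # One 1x1 attention head per modality, scoring each BEV cell.
        self.score_lidar = nn.Conv2d(channels, 1, kernel_size=1)
        self.score_radar = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feat_lidar, feat_radar, lidar_ok=True, radar_ok=True):
        neg_inf = torch.finfo(feat_lidar.dtype).min
        s_l = self.score_lidar(feat_lidar)
        s_r = self.score_radar(feat_radar)
        if not lidar_ok:                 # sensor failure: zero out its weight
            s_l = torch.full_like(s_l, neg_inf)
        if not radar_ok:
            s_r = torch.full_like(s_r, neg_inf)
        w = torch.softmax(torch.cat([s_l, s_r], dim=1), dim=1)  # per-cell weights
        return w[:, :1] * feat_lidar + w[:, 1:] * feat_radar

# Usage: with radar_ok=False the LiDAR branch receives weight 1.0 everywhere,
# so detection degrades gracefully to the single surviving modality.
fusion = DecoupledFusion(channels=64)
bev_l = torch.randn(1, 64, 128, 128)
bev_r = torch.randn(1, 64, 128, 128)
fused = fusion(bev_l, bev_r, radar_ok=False)
```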