| In recent years, with the rapid development of autonomous driving, robotics, and related fields, LiDAR has become a core sensor for environment perception. The point clouds acquired by LiDAR carry centimeter-level position information and are an important data source for high-precision positioning. As one of the key technologies of environment perception, deep-learning-based object detection on point clouds has considerable research and application value. This thesis addresses the full pipeline of point cloud object detection and studies, in turn, the point cloud down-sampling algorithm, the local feature aggregation method, and the design of the detection head. The main research contents are summarized as follows: (1) A fusion down-sampling method based on semantic perception. In a hierarchical network, point cloud down-sampling represents the original point set by a sampled subset; existing methods can be divided into heuristic methods and learning-based methods. For point cloud object detection, current learning-based sampling methods are computationally expensive, while heuristic sampling methods are not task-oriented. Considering the importance of foreground points for bounding-box regression, and to avoid redundantly sampling foreground points from the same instance, this thesis combines heuristic sampling with a learnable strategy and designs a semantic-weighted farthest point sampling algorithm. The algorithm preserves foreground points well while keeping the down-sampling relatively uniform. (2) A local feature aggregation method based on point attention. To expand the receptive field in a hierarchical network, local features of the point cloud are generally extracted around each sampled point as a center. This thesis designs a local feature aggregation module composed of a spatial position encoder, point attention, and a max-pooling operation. The spatial position encoder is 
used to strengthen the positional relationships among points in a local region, point attention is used to learn the relative importance of point features across space and channels, and max pooling is used to aggregate the local features. The module enhances the network’s ability to extract local features and is lightweight, adding little extra computation. (3) An IoU-aware detection head. In object detection, the detection head is generally responsible for object classification and position regression. Current detection heads commonly suffer from two inconsistencies: one between classification confidence and localization accuracy, and one between the IoU evaluation criterion and the regression loss. To address both, this thesis designs an IoU-aware detection head. An IoU-based post-processing method uses the predicted IoU to align the classification confidence, yielding an IoU-rectified confidence. The rectified confidence then serves as the ranking criterion for non-maximum suppression, which reduces the chance of retaining bounding boxes with low localization accuracy. Moreover, a three-dimensional IoU loss that accounts for the rotation angle treats the otherwise independent box parameters as a whole in the regression task, so that bounding-box regression is guided by an IoU-based optimization strategy. The IoU-aware detection head improves the localization accuracy of the predicted boxes. |
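To make the idea in (1) concrete, the following is a minimal sketch of a semantic-weighted farthest point sampling step. The abstract does not state the weighting formula, so the form used here (the squared distance to the nearest selected point multiplied by `fg_scores ** gamma`) is an assumption for illustration, as are the function name `semantic_weighted_fps` and the parameter `gamma`. In practice the foreground scores would presumably come from a learned per-point classifier.

```python
import numpy as np


def semantic_weighted_fps(points, fg_scores, n_samples, gamma=1.0):
    """Farthest point sampling with distances scaled by foreground score.

    points:    (N, 3) array of coordinates.
    fg_scores: (N,) array of foreground probabilities in [0, 1].
    Returns an int64 array of n_samples indices (requires n_samples <= N).
    The score ** gamma weighting is an assumed form, not the thesis formula.
    """
    n = points.shape[0]
    weights = fg_scores ** gamma
    selected = np.empty(n_samples, dtype=np.int64)
    # Seed with the point that has the highest foreground score.
    selected[0] = int(np.argmax(fg_scores))
    # min_dist[i]: squared distance from point i to its nearest selected point.
    min_dist = np.full(n, np.inf)
    for k in range(1, n_samples):
        last = points[selected[k - 1]]
        dist_to_last = np.sum((points - last) ** 2, axis=1)
        min_dist = np.minimum(min_dist, dist_to_last)
        # Prefer points that are both far from the selected set
        # and likely to be foreground.
        selected[k] = int(np.argmax(weights * min_dist))
    return selected
```

With `gamma = 0` every weight equals one and the procedure reduces to ordinary farthest point sampling; larger values of `gamma` bias the sample more strongly toward points scored as foreground, at the cost of spatial uniformity.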
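The post-processing step in (3) can be sketched in the same spirit. The abstract says only that the predicted IoU is used to align the classification confidence and that the result ranks boxes in non-maximum suppression; the multiplicative form `cls_scores * pred_ious ** beta`, the names `rectify_confidence`, `bev_iou`, and `nms`, and the parameter `beta` are assumptions. For brevity the sketch uses axis-aligned 2D boxes `(x1, y1, x2, y2)` rather than the rotated 3D boxes a LiDAR detector would produce.

```python
import numpy as np


def rectify_confidence(cls_scores, pred_ious, beta=1.0):
    """Combine classification score and predicted IoU (assumed form)."""
    return cls_scores * np.clip(pred_ious, 0.0, 1.0) ** beta


def bev_iou(box, boxes):
    """IoU between one axis-aligned box and an (M, 4) array of boxes."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0.0, None) * np.clip(y2 - y1, 0.0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)


def nms(boxes, scores, iou_thresh=0.5):
    """Greedy NMS; returns kept indices in descending score order."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the kept box too much.
        overlaps = bev_iou(boxes[best], boxes[rest])
        order = rest[overlaps <= iou_thresh]
    return keep
```

Passing `rectify_confidence(cls_scores, pred_ious)` instead of the raw classification scores to `nms` changes which of two overlapping boxes survives: a box with a slightly lower class score but a higher predicted IoU can outrank its neighbor, which is the effect the abstract describes.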