With the rapid development of autonomous driving technology, vehicle-mounted LiDAR has become one of the key vehicle sensors, and processing the three-dimensional point cloud data it collects has become a crucial step in improving the vehicle’s perception of its surroundings. Among the many point cloud processing tasks, 3D point cloud shape classification and semantic segmentation are two currently popular research directions. Although researchers have proposed many solutions for these two tasks in recent years, the feature extraction ability of point cloud networks still needs to be improved. This thesis explores deep-learning-based point cloud processing techniques and studies 3D point cloud shape classification and semantic segmentation in depth, combining attention mechanisms with multi-scale feature fusion to obtain better results. The main work of this thesis is as follows:

(1) Using traditional clustering-based point cloud segmentation methods, each point cloud frame in the KITTI and DAIR-V2X datasets was segmented into clusters, and the points of each cluster were extracted. Then, with the aid of the annotations provided with the datasets, a real-world road scene point cloud classification dataset was constructed.

(2) To enhance the feature extraction capability of networks, this thesis proposes a dual-attention module. The module consists of two cascaded sub-modules: one is a multi-angle, multi-level channel attention sub-module that globally influences features at different levels from various angles, automatically selects task-relevant features, and suppresses irrelevant ones; the other is a spatial attention sub-module based on the self-attention mechanism, which complements the channel attention sub-module and further enhances the network’s feature representation capability. To verify the performance of the dual-attention module, experiments and result analysis were conducted on the constructed point cloud classification dataset using PointNet and PointNet++ as baseline networks. The results demonstrate the effectiveness of the proposed module, show that it can be transferred to structurally similar networks, and show that both sub-modules help improve the classification accuracy of the networks.

(3) The network structure of U-Net was studied, and an "encoder-decoder" structure was built based on it. To better adapt to outdoor point clouds, which contain both sparse and dense regions, cylindrical voxels were used to partition the point cloud, reducing the proportion of empty voxels. Furthermore, to extract richer features and enhance the network’s perception of small objects, this thesis proposes a multi-scale feature fusion module based on 3D sparse convolution and a channel attention module, which helps the network form more discriminative features. Finally, the proposed network achieved high segmentation accuracy on the publicly available SemanticKITTI dataset for laser point cloud semantic segmentation, as shown by quantitative and visual comparisons of the results.
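
The cylindrical voxel partition mentioned in (3) can be illustrated with a short example. The following is only a minimal sketch and not the thesis implementation: the function names `cylindrical_voxel_index` and `partition`, the grid resolution, and the radius and height ranges are all hypothetical values chosen for illustration. Each point is converted from Cartesian coordinates (x, y, z) to cylindrical coordinates (radius, azimuth, height), and each coordinate is then quantized into a bin.

```python
import math
from collections import defaultdict

# Hypothetical settings for illustration only; the thesis does not state them here.
RHO_MAX = 50.0             # maximum radial distance covered by the grid (m)
Z_MIN, Z_MAX = -4.0, 4.0   # height range covered by the grid (m)
GRID = (10, 8, 4)          # number of bins along (radius, azimuth, height)


def cylindrical_voxel_index(point, rho_max=RHO_MAX, z_min=Z_MIN, z_max=Z_MAX, grid=GRID):
    """Map a Cartesian point (x, y, z) to a (radius, azimuth, height) voxel index."""
    x, y, z = point
    rho = math.hypot(x, y)   # radial distance from the sensor axis
    phi = math.atan2(y, x)   # azimuth angle in [-pi, pi]
    n_rho, n_phi, n_z = grid
    i = int(rho / rho_max * n_rho)
    j = int((phi + math.pi) / (2.0 * math.pi) * n_phi)
    k = int((z - z_min) / (z_max - z_min) * n_z)
    # Clamp out-of-range points into the boundary bins.
    i = min(max(i, 0), n_rho - 1)
    j = min(max(j, 0), n_phi - 1)
    k = min(max(k, 0), n_z - 1)
    return (i, j, k)


def partition(points):
    """Group points by cylindrical voxel index; only occupied voxels are stored."""
    voxels = defaultdict(list)
    for p in points:
        voxels[cylindrical_voxel_index(p)].append(p)
    return dict(voxels)


if __name__ == "__main__":
    cloud = [(12.0, 0.0, 0.5), (12.5, 0.1, 0.6), (1.0, -33.0, -3.0)]
    for index, members in partition(cloud).items():
        print(index, len(members))
```

Because the azimuth bins are wedge-shaped, a cell covers more ground area the farther it lies from the sensor, which is the usual reason a cylindrical grid is expected to leave fewer empty cells than a uniform cubic grid where distant points are sparse.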