| With the development of laser sensing technology, capturing 3D data has become more convenient. The analysis and understanding of 3D point cloud data has become an important research direction in 3D computer vision. Learning and representing the features of 3D point cloud data efficiently and accurately is the key problem that must be solved to enable deep learning analysis of point clouds. In recent years, deep learning techniques have made significant breakthroughs in a variety of computer vision tasks, in which convolutional neural networks play a key role as the core of feature encoding. However, because point clouds are unordered, sparse, and irregular, applying deep learning techniques to point cloud analysis still faces many challenges: 1) Hierarchical deep learning models must organize the data structure of point clouds to construct the local grouping and hierarchical sampling required by feature encoding networks. However, traditional methods ignore the influence of the spatial features of the points themselves and cannot adaptively organize the point cloud data according to those spatial features, while their low utilization of computational resources reduces the training and inference speed of the model. 2) A point cloud is a non-Euclidean data structure, so the regular deep convolutional networks used on images are difficult to extend directly to the analysis and understanding of 3D point clouds; spatial convolutional filter generation methods need to be designed specifically for point clouds. To address these problems, this paper provides an in-depth exploration of point cloud deep learning, aiming to solve the common problems of deep learning for point cloud analysis. Specifically, the main research work of this paper is as follows. (1) Hierarchical fast data structuring of point clouds based on spatial features. Because point clouds are unordered, sampling 
and local grouping are necessary data structuring steps in hierarchical deep learning architectures for point clouds. Most existing hierarchical point cloud deep learning models adopt iterative farthest point queries for sampling and search for neighborhood points around the sampled points for grouping. This process relies on a large number of Euclidean distance calculations, most of which are repeated during the point discovery process. In addition, the search for sampled points is difficult to parallelize, which reduces the inference speed of the model. To address these problems, this paper proposes a data structuring method based on spatial feature transformation, which structures the point cloud hierarchically according to the spatial features of the points themselves, divides points with similar spatial features into the same group, and elects feature-representative points from each group as sampling points. As a result, the data structuring results of the proposed method are stable and easy to parallelize. In addition, the method can replace the data structuring methods in mainstream point cloud deep learning models in a plug-and-play manner. Experimental results show that the proposed method significantly improves training and inference speed while maintaining model accuracy. (2) 3D point cloud convolution based on frame points attention. Because point clouds are unordered, sparse, and irregular, it is difficult to define conventional regular convolution directly on the local context of a point cloud. Existing work attempts to generate the corresponding convolution filters from the local position information of the points, but relies on heuristics, such as distance-based linear interpolation, when computing the weights. This heuristic knowledge limits the flexibility of filter generation, which reduces the diversity of the filter feature templates and makes the 
extraction of local features less comprehensive. To address these problems, we propose a spatial convolution based on frame points attention. The convolution weights of the neighborhood points are interpolated nonlinearly through the spatial attention relationship between the neighborhood points and predefined frame points, which reduces the heuristic constraints on filter generation and therefore makes the generated feature templates more diverse. In addition, we optimize the memory overhead of the proposed convolution by reducing the internal dimensionality used during training, thus reducing memory consumption and significantly improving training speed. Based on this convolution, we construct networks for three common point cloud tasks and conduct experiments on widely used datasets. Experimental results show that the proposed method is competitive with the state of the art on point cloud tasks. (3) Multiple shape-perception convolution based on frame points. Reducing the heuristic constraints on the convolution weight learning strategy effectively improves filter diversity, but it also increases the complexity of the convolution weight generation method. Shape perception-based convolution learns the filter weights from the local shape features of the point cloud, which makes the filters more strongly correlated with local shapes. Previous methods construct a star topology around a single centroid for local shape perception; this strategy perceives shape only coarsely, so the filter learning process cannot be accurately associated with the local shape, which reduces the effectiveness of the convolution. In this paper, we propose a new convolution strategy called frame-point multiple-relationship convolution: multiple centroids of a local shape are used to generate multiple perceptions of the local shape features as well as of the relationships between local points. These multiple perceptions help generate a more 
accurate perception of the local shapes. The perceptions are used to learn the feature transformation function of the convolution operator, which in turn allows the convolution to adapt to various local shapes and obtain more appropriate convolutional filters for point clouds. We build hierarchical convolutional neural networks based on the proposed method and apply them to three common point cloud tasks. The experimental results show the effectiveness of the proposed convolution. We also conduct experiments and a comprehensive analysis to demonstrate and explain the underlying principles of our method. (4) 3D object detection network based on random sampling from annular voxels. Object detection is one of the important applications of 3D computer vision and is commonly used in autonomous driving and robotics. Models based on point representations offer improved 3D feature encoding capability. However, because point cloud scene data is large in scale, point-based representations are slow to process. In this paper, we propose a random sampling method based on an annular voxel grid, which can quickly perform uniform subsampling of scene point clouds and improve the pre-processing speed of point clouds. In addition, we propose a feature extraction backbone network for object detection that combines the data structuring method and the feature abstraction operator proposed in this paper, and we verify it experimentally on an autonomous driving dataset. The experimental results show that the proposed method improves the accuracy of object detection while significantly reducing inference time. |
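
As a rough illustration of the random sampling over an annular voxel grid mentioned in contribution (4), the sketch below shows one plausible reading of the idea, not the implementation used in this work. It assumes the sensor sits at the origin, bins each point by ring (horizontal radius), angular sector (azimuth), and height layer, and keeps one randomly chosen point per occupied cell. The function name `annular_voxel_sample` and the parameters `ring_width`, `num_sectors`, `height`, and `seed` are hypothetical and do not come from the text.

```python
import math
import random
from collections import defaultdict


def annular_voxel_sample(points, ring_width=1.0, num_sectors=36, height=1.0, seed=0):
    """Keep one randomly chosen point per occupied annular voxel.

    points: iterable of (x, y, z) tuples; the sensor is assumed to be at the origin.
    All parameter names and defaults are illustrative assumptions.
    """
    if ring_width <= 0 or height <= 0 or num_sectors <= 0:
        raise ValueError("ring_width, height and num_sectors must be positive")

    rng = random.Random(seed)  # local generator so results are reproducible
    sector_angle = 2.0 * math.pi / num_sectors
    voxels = defaultdict(list)

    for p in points:
        x, y, z = p
        radius = math.hypot(x, y)                      # horizontal distance to the sensor
        azimuth = math.atan2(y, x) % (2.0 * math.pi)   # angle mapped into [0, 2*pi]
        key = (
            int(radius // ring_width),                                # ring index
            min(int(azimuth // sector_angle), num_sectors - 1),      # sector index, clamped
            math.floor(z / height),                                   # height layer index
        )
        voxels[key].append(p)

    # One random representative per occupied voxel.
    return [rng.choice(bucket) for bucket in voxels.values()]


# Usage: the first two points fall into the same voxel, so only one of them survives.
cloud = [(1.0, 0.1, 0.0), (1.1, 0.12, 0.1), (5.0, 2.0, 0.2), (-3.0, 0.5, 0.0)]
sampled = annular_voxel_sample(cloud, ring_width=2.0, num_sectors=8, height=1.0)
print(len(sampled), "of", len(cloud), "points kept")
```

Each point is assigned to a cell independently of every other point, so the binning step needs no pairwise distance computations and could in principle be parallelized, which is consistent with the speed motivation given above. How the cells are actually sized and how points are drawn from them in the proposed method is not specified in this abstract.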