| In a real-time transmission system for 3D point clouds in autonomous driving applications, the point cloud must be compressed to ensure fast and smooth transmission, and the compressed point cloud must also be analyzed and understood to achieve accurate perception of the surrounding scene. However, existing point cloud compression methods focus only on improving compression performance and ignore the combination with downstream vision tasks. As a result, the distortions caused by point cloud compression affect the accuracy of machine vision analysis and degrade the performance of vision tasks. This dissertation studies point cloud compression methods oriented to shape classification and object detection in autonomous driving. It aims to solve the issues present in existing compression methods, such as the inefficient representation of point cloud features, the complex structure of compression networks, and the neglect of combining point cloud compression with downstream vision tasks. The purpose of this study is to ensure high performance of visual detection tasks while achieving efficient compression of point clouds. The research work in this dissertation is as follows: (1) An attention-aware downsampling network oriented to shape classification for point clouds. To address the inefficient feature learning and sampling methods, as well as the complex matching method, in existing point cloud downsampling networks, first, a point-voxel combined feature extraction method is proposed to extract point-wise, global, and multi-scale local features of the point cloud. Second, a sampling network based on the attention mechanism is proposed to focus on regions of interest within the point cloud. Subsequently, a row-column constraint module is designed to simplify the matching process for the sampled point cloud. A joint loss function comprising the sampling loss, the sampling matrix constraint
loss, and the classification loss is constructed to optimize the downsampled point cloud for downstream classification tasks. For convenience of application and to conserve resources, a multi-rate point cloud downsampling network is proposed to obtain point clouds at multiple sampling rates in a single training session. Experimental results on datasets such as ModelNet40 demonstrate that the proposed downsampling network retains over 90% of the classification accuracy obtained on the original point cloud when the point cloud is compressed to 1/32 of the original data, ensuring good task performance while achieving point cloud compression. (2) A lightweight Transformer-based encoding network oriented to shape classification for point clouds. First, to enhance the learning and representation ability for point clouds, a lightweight Transformer based on a skip attention mechanism is proposed to extract point cloud features. Subsequently, to address the information redundancy among sampled points and the loss of detail in the downsampling process, an encoding network based on the lightweight Transformer is built: a downsampling unit in the encoder is proposed to achieve a compact representation of the point cloud, while an upsampling unit in the decoder is proposed to recover the received bitstream into a reconstructed point cloud. At the same time, feature embedding channels are added to both the encoder and the decoder to retain more information from the original point cloud. To improve the classification accuracy of the reconstructed point cloud, a joint loss function including the rate-distortion loss and the classification loss is constructed to train the encoding network and optimize the reconstructed point cloud. Experimental results on the SemanticKITTI and ModelNet40 datasets demonstrate that the proposed network improves the rate-distortion performance compared with similar encoding methods, and that the classification accuracy is
increased compared with the previously proposed downsampling network. The task performance is further improved while achieving efficient point cloud compression. (3) A region-of-interest hierarchical octree encoding network oriented to object detection for point clouds. First, to reduce the amount of point cloud data while highlighting the regions of interest for object detection, a circular grid method is proposed to remove the ground points in the scene. Second, to enhance the compression performance while retaining valuable semantic information for object detection, a contextual neighborhood entropy model is constructed using the information of the higher-level ancestor nodes, the same-level neighbor nodes, and the lower-level sibling child nodes. Additionally, to improve the quality of the reconstructed point cloud, a coordinate refinement module is added to the decoder, which uses multi-level octree information to learn coordinate residuals. Adding these residuals to the coordinates decoded from the octree yields accurate reconstructed point cloud coordinates. Experimental results on the KITTI dataset demonstrate that the proposed network saves more than 10% bitrate compared with similar methods and improves object detection precision by about 1%, achieving efficient compression and accurate object detection. (4) A point cloud compression and detection system for future unmanned driving applications. To achieve intelligent application of autonomous driving, a point cloud compression and detection system for future unmanned driving is constructed, and the effectiveness of the point cloud compression methods proposed in this dissertation is verified in the system, indicating their potential for application in the field of autonomous driving. |
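
The abstract does not spell out how the circular grid method in part (3) removes ground points. As an illustration only, the sketch below uses a generic polar-grid heuristic that may differ from the dissertation's method: points are binned into rings and angular sectors around the sensor, and any point close to the lowest height in its cell is treated as ground. The function name `remove_ground` and the parameters `ring_width`, `num_sectors`, and `height_margin` are hypothetical, and the default values are arbitrary.

```python
import math
from collections import defaultdict


def remove_ground(points, ring_width=1.0, num_sectors=36, height_margin=0.2):
    """Drop points within `height_margin` of the lowest point in their polar cell.

    Illustrative heuristic only (an assumption, not the dissertation's method).
    `points` is a list of (x, y, z) tuples in the sensor frame.
    """

    def cell(p):
        x, y, _ = p
        # Ring index from horizontal range, sector index from azimuth.
        ring = int(math.hypot(x, y) // ring_width)
        angle = (math.atan2(y, x) + math.pi) / (2 * math.pi)
        sector = int(angle * num_sectors) % num_sectors
        return ring, sector

    # First pass: record the lowest height seen in each polar cell.
    lowest = defaultdict(lambda: math.inf)
    for p in points:
        c = cell(p)
        lowest[c] = min(lowest[c], p[2])

    # Second pass: keep only points clearly above their cell's lowest height.
    return [p for p in points if p[2] - lowest[cell(p)] > height_margin]


scan = [
    (1.0, 0.1, -1.7),     # near-range ground return
    (1.1, 0.1, -1.65),    # near-range ground return
    (1.05, 0.1, -0.5),    # near-range object point
    (10.0, 10.0, -1.6),   # far-range ground return
    (10.1, 10.0, 0.3),    # far-range object point
]
kept = remove_ground(scan)
```

One limitation of this simple rule is that the lowest point of every cell is always discarded, even in cells that contain no ground at all, so a practical system would need a more careful ground model.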
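
To make the octree terminology in part (3) concrete, the sketch below shows plain octree occupancy coding and the coordinate residual that remains after decoding. It is a minimal illustration under stated assumptions: points are already normalized to the unit cube, the names `octree_encode` and `octree_decode` are hypothetical, and the occupancy bytes are left uncompressed. It does not reproduce the contextual neighborhood entropy model, and the residuals here are computed directly from the original points, whereas the dissertation's decoder learns to predict them.

```python
from collections import defaultdict


def octree_encode(points, depth):
    """Return breadth-first 8-bit occupancy codes for points in [0, 1)^3."""
    n = 1 << depth
    # Quantize each point to an integer voxel index at the leaf level.
    voxels = {tuple(min(int(c * n), n - 1) for c in p) for p in points}
    codes = []
    nodes = [(0, 0, 0)]  # the root node
    for level in range(depth):
        shift = depth - level - 1
        children_of = defaultdict(set)
        for v in voxels:
            child = tuple(c >> shift for c in v)
            parent = tuple(c >> 1 for c in child)
            children_of[parent].add(child)
        next_nodes = []
        for node in nodes:
            code = 0
            for child in sorted(children_of[node]):
                # Child index packs the low bits of x, y, z into 3 bits.
                idx = ((child[0] & 1) << 2) | ((child[1] & 1) << 1) | (child[2] & 1)
                code |= 1 << idx
                next_nodes.append(child)
            codes.append(code)
        nodes = next_nodes
    return codes


def octree_decode(codes, depth):
    """Rebuild the occupied leaf voxels and return their centers."""
    nodes = [(0, 0, 0)]
    stream = iter(codes)
    for _ in range(depth):
        next_nodes = []
        for x, y, z in nodes:
            code = next(stream)
            for idx in range(8):
                if code & (1 << idx):
                    next_nodes.append(
                        (2 * x + (idx >> 2), 2 * y + ((idx >> 1) & 1), 2 * z + (idx & 1))
                    )
        nodes = next_nodes
    n = 1 << depth
    return [tuple((c + 0.5) / n for c in v) for v in nodes]


pts = [(0.1, 0.2, 0.3), (0.8, 0.7, 0.6), (0.12, 0.22, 0.31)]
codes = octree_encode(pts, depth=3)
centers = octree_decode(codes, depth=3)

# Residual between the first point and its decoded voxel center; each
# component is bounded by half a voxel, i.e. 0.5 / 2**depth.
residual = tuple(p - c for p, c in zip(pts[0], centers[0]))
```

In this example the first and third points fall into the same leaf voxel, so only two centers are decoded. The residual is the information lost to quantization, which is the quantity a coordinate refinement step tries to recover.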