
Research On 3D Object Detection Algorithm Based On Binocular Image And Sparse Point Cloud

Posted on: 2023-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: G Yao
Full Text: PDF
GTID: 2532307097476914
Subject: Mechanical engineering
Abstract/Summary:
As intelligent driving advances, attention is shifting from 2D object detection to 3D object detection, where a major difficulty is obtaining accurate object position information. 3D environment perception currently relies mainly on lidar and cameras. Lidar provides accurate depth information but is relatively expensive; image-based 3D detection algorithms are easier to commercialize, but detection from image data alone has large errors. To address these problems, this thesis proposes a 3D object detection algorithm based on binocular images and sparse point clouds. The algorithm uses a sparse point cloud to correct the depth predicted from the binocular image, converts the corrected depth image into a pseudo-lidar point cloud (organized in the same form as a lidar point cloud), and finally uses the binocular image and the pseudo-lidar point cloud as the input to a 3D object detection algorithm, with the aim of using low-cost data to achieve detection performance comparable to that of a real lidar point cloud. The specific research contents are as follows:

(1) Improvement of the depth image generation network structure. In the depth generation stage, the existing binocular depth generation network SDN (Stereo Depth Network) is improved. Drawing on the lightweight design of the MobileNet series, the feature extraction part replaces the standard residual module with an inverted residual module containing depthwise separable convolution, which reduces the number of feature extraction parameters by 40% and reduces the time required to generate a single depth image by 0.015 s on average. In addition, the linear bottleneck layer of the inverted residual module effectively reduces the loss of feature information when passing through the nonlinear activation function. Adjusting the aggregation strategy of the pyramid pooling module in the network, supplemented by dilated convolution, ensures sufficient feature extraction capability.

(2) Development of the depth image correction algorithm. To improve the accuracy of the 3D detection network, this thesis proposes using a weighted K-nearest neighbor algorithm to correct the depth image produced in the depth generation stage. The method is as follows. First, the depth image is converted into a pseudo-lidar point cloud, and the initial weights between points are constructed. Then, the sparse point cloud replaces the pseudo-lidar points at the corresponding positions, and the depth values of those pseudo-lidar points are updated. Finally, K-nearest neighbor retrieval is performed in point cloud space, and the depth values of the remaining pseudo-lidar points are updated on the depth image according to the weights and the retrieval results. Nearest neighbor retrieval in the point cloud makes full use of the geometric information of the point cloud, while performing the correction on the depth image avoids the complexity of correcting three-dimensional point coordinates. The correction algorithm achieved an average accuracy improvement of 19.87% on the vehicle category.

(3) 3D object detection and evaluation. In the final 3D object detection stage, the 3D object detection framework is trained on the prepared binocular images and pseudo-lidar point clouds, and the final detection results are compiled. An ablation experiment is designed to compare the contributions of the improved depth generation network and the depth correction algorithm to the 3D detection results.
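The depth correction steps in item (2) can be illustrated with a minimal NumPy sketch. The abstract does not specify the weighting scheme or the update rule, so the following are assumptions made only for illustration: a pinhole camera with intrinsics `fx, fy, cx, cy`, inverse-distance weights, and propagation of the depth residual (lidar depth minus predicted depth) from each sparse lidar point to its neighbors. The function names `depth_to_pseudo_lidar` and `correct_depth` are hypothetical and do not come from the thesis.

```python
import numpy as np


def depth_to_pseudo_lidar(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth image to an (H*W, 3) point array.

    Assumes a pinhole camera model; points are in the camera frame.
    """
    h, w = depth.shape
    v, u = np.meshgrid(np.arange(h), np.arange(w), indexing="ij")  # v = row, u = column
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.stack([x, y, depth], axis=-1).reshape(-1, 3)


def correct_depth(depth, anchor_rows, anchor_cols, anchor_depth,
                  fx, fy, cx, cy, k=3, eps=1e-6):
    """Correct a predicted depth image with sparse lidar depths (illustrative only).

    anchor_rows / anchor_cols: pixel positions where a lidar depth is available.
    anchor_depth: lidar depth at those pixels.
    Assumption: inverse-distance weights over the k nearest lidar points.
    """
    depth = np.asarray(depth, dtype=float)
    anchor_rows = np.asarray(anchor_rows)
    anchor_cols = np.asarray(anchor_cols)
    anchor_depth = np.asarray(anchor_depth, dtype=float)
    h, w = depth.shape
    k = min(k, len(anchor_depth))

    # Depth error of the prediction at the pixels that have a lidar measurement.
    residual = anchor_depth - depth[anchor_rows, anchor_cols]

    # Replace predicted depths with lidar depths at the corresponding positions,
    # then convert the depth image to a pseudo-lidar point cloud.
    replaced = depth.copy()
    replaced[anchor_rows, anchor_cols] = anchor_depth
    points = depth_to_pseudo_lidar(replaced, fx, fy, cx, cy)
    anchors = points[anchor_rows * w + anchor_cols]

    # Brute-force k-nearest-neighbor search in 3D point cloud space.
    dist = np.linalg.norm(points[:, None, :] - anchors[None, :, :], axis=-1)  # (N, M)
    nn = np.argsort(dist, axis=1)[:, :k]
    nn_dist = np.take_along_axis(dist, nn, axis=1)

    # Inverse-distance weights, normalized so they sum to 1 for each point.
    weights = 1.0 / (nn_dist + eps)
    weights /= weights.sum(axis=1, keepdims=True)

    # Apply the weighted residual on the depth image, not on the 3D coordinates.
    correction = (weights * residual[nn]).sum(axis=1).reshape(h, w)
    corrected = depth + correction
    corrected[anchor_rows, anchor_cols] = anchor_depth
    return corrected


# Example: a prediction that is uniformly 2 m too deep, with three lidar points.
predicted = np.full((4, 5), 12.0)
corrected = correct_depth(predicted, [0, 3, 2], [0, 4, 1], [10.0, 10.0, 10.0],
                          fx=100.0, fy=100.0, cx=2.0, cy=2.0)
```

The brute-force distance matrix here costs O(N·M) memory and is only practical for small inputs; for full-resolution depth images a spatial index such as a k-d tree would likely be a better fit. Searching in 3D while writing the update back to the depth image mirrors the split described in item (2).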
Keywords/Search Tags:Deep learning, Convolutional neural network, Weighted K-nearest Neighbor algorithm, 3D target detection, Pseudo lidar point cloud