Research On 3-D Target Recognition Algorithms In Typical Unmanned Scene Based On Model Conversion

Posted on: 2020-08-24
Degree: Master
Type: Thesis
Country: China
Candidate: A Y Chen
GTID: 2392330590474510
Subject: Control Science and Engineering
Abstract/Summary:
Unmanned driving is currently a hot topic and still faces many technical challenges. The key problem of target recognition in unmanned driving is recognizing the category, position, and orientation of each target. Point cloud data provides more spatial information than RGB image data and is therefore better suited to three-dimensional recognition. However, the large volume of point cloud data places heavy demands on computing and storage capacity, and point clouds are difficult to process directly with convolution, which limits their application. How to make better use of point clouds is therefore the first problem the algorithm must address. To deal with the facts that three-dimensional point clouds are not easily interpreted by two-dimensional convolutional networks and that the amount of data is too large, this thesis studies a multi-scale, end-to-end three-dimensional target recognition algorithm for typical unmanned driving scenes. The algorithm simultaneously obtains the category, location, and minimum bounding rectangle of targets of different sizes.

Firstly, the KITTI data set is studied, and the 3D point cloud data is projected onto the 2D RGB image to ensure that the two are matched. In a two-dimensional RGB image or a front view, small targets in the scene may overlap, occlude one another heavily, and lie close together, whereas the point clouds of individual targets remain distinct in the bird's-eye view. The point clouds are therefore converted into a bird's-eye view, from which information is acquired indirectly. Feeding the bird's-eye view into a convolutional neural network greatly reduces the computational load. It also allows a mature two-dimensional detection framework to be transplanted to three-dimensional target recognition, which helps improve accuracy.

Secondly, target sizes in the bird's-eye view map differ widely and some targets are very small, which makes them difficult to recognize. To obtain the target category and location from the bird's-eye view map under these conditions, this thesis designs a multi-scale, end-to-end target detection network. To handle the differences in target size, the network uses a feature pyramid structure for multi-scale fusion and uses dilated convolution to enlarge the receptive field at a given scale. Batch normalization and Leaky ReLU are also used so that the network keeps updating and converges quickly. In addition, graphics operations such as expansion (dilation) and corner searching are used to carry out preliminary target recognition and to acquire the orientation of the target, and the reasons for success and failure are analyzed.

Then, because training all objectives at once in a multi-objective network tends to leave the network hovering between objectives and makes convergence difficult, this thesis proposes a segment-guided network training method, in which the learning objectives of the network are added step by step. The target location detection network is taken as the basis for the subsequent improvements. Because the orientation and the location of a target have different dimensions, the dimension is expanded after the feature map is obtained, in order to extract more orientation-related features and improve recognition. An IoU-based clustering method is used to obtain more suitable prior boxes for prediction, which helps the regression of the minimum bounding rectangle. A new loss function is also proposed, and GIoU is used to help the regression of target attitude.

Finally, a series of ablation studies is carried out on the functional modules of the network, and comparison of the experimental results demonstrates the effectiveness of the network design. The recognition results obtained in the bird's-eye view are also transformed into three dimensions to obtain three-dimensional bounding boxes, and the experimental results are compared with those of other algorithms, which demonstrates the feasibility of the algorithm.
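
The conversion of a point cloud into a bird's-eye view, as described in the abstract, can be illustrated with a minimal sketch. The abstract does not state the grid extent, the cell size, or which per-cell features the thesis encodes, so the function name `point_cloud_to_bev`, the default ranges, the resolution, and the choice of a simple point-count (density) map are all assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]  # (x forward, y left, z up), in metres


def point_cloud_to_bev(
    points: Sequence[Point],
    x_range: Tuple[float, float] = (0.0, 40.0),    # assumed forward extent
    y_range: Tuple[float, float] = (-20.0, 20.0),  # assumed lateral extent
    resolution: float = 0.1,                       # assumed metres per cell
) -> List[List[int]]:
    """Count how many points fall into each cell of a top-down grid."""
    rows = int(round((x_range[1] - x_range[0]) / resolution))
    cols = int(round((y_range[1] - y_range[0]) / resolution))
    grid = [[0] * cols for _ in range(rows)]

    for x, y, _z in points:
        # Discard points outside the region of interest.
        if not (x_range[0] <= x < x_range[1] and y_range[0] <= y < y_range[1]):
            continue
        # Map metric coordinates to integer cell indices; clamp to guard
        # against floating-point rounding at the upper boundary.
        r = min(int((x - x_range[0]) / resolution), rows - 1)
        c = min(int((y - y_range[0]) / resolution), cols - 1)
        grid[r][c] += 1

    return grid
```

The resulting 2D grid can be treated like an image and passed to a two-dimensional detection network. A practical encoding would usually add further channels, such as maximum height or reflectance per cell, which this sketch leaves out.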
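
The abstract mentions GIoU as part of the loss but does not give the formulation. The sketch below shows the standard generalized IoU for two axis-aligned boxes given as `(x1, y1, x2, y2)`. It is not the thesis's loss function: how the thesis applies GIoU to oriented boxes is not stated, and the function name `giou` and the box format are assumptions.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) with x1 < x2, y1 < y2


def giou(box_a: Box, box_b: Box) -> float:
    """Generalized IoU of two axis-aligned boxes; the value lies in (-1, 1]."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)

    # Intersection rectangle (zero if the boxes do not overlap).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = area_a + area_b - inter

    # Smallest axis-aligned box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    if union <= 0.0 or hull <= 0.0:
        raise ValueError("boxes must have positive area")

    iou = inter / union
    # Penalize the part of the enclosing box covered by neither input.
    return iou - (hull - union) / hull
```

A regression loss is commonly taken as `1 - giou(pred, target)`. Unlike plain IoU, this still changes with the distance between two boxes that do not overlap at all, so it continues to provide a training signal in that case.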
Keywords/Search Tags: Unmanned driving, Three-dimensional target recognition, Point cloud preprocessing, Convolutional neural network, Multi-scale fusion, Segmental guidance training method