Font Size: a A A

3-Dimensional Vehicle Detection And Tracking Based On Monocular Vision

Posted on:2022-02-03Degree:MasterType:Thesis
Country:ChinaCandidate:T T ZhaoFull Text:PDF
GTID:2492306332951169Subject:Vehicle Engineering
Abstract/Summary:PDF Full Text Request
Perception system is the key to self-driving cars.Obtaining visual information and trajectory information of vehicles,pedestrians and other targets in traffic scenes is the key to downstream tasks.Among them,3D vehicle detection and tracking are the core task in perception system.Here,a framework combining 3D vehicle detection and 3D vehicle tracking utilizing monocular images is proposed.The following is the specific research content:(1)Aiming at the problem of 3D vehicle detection,a two-stage cascaded 3D vehicle detection network is proposed.In the first stage,based on the feature fusion module of the improved Res Net-34 network design,feature extraction and fusion are performed on the input image;after obtaining the fusion feature map,a 3D attribute regression module is designed to perform parameter regression.This module contains five branches,which are used to return to the five types of attributes of the target 2D bounding box,target local orientation angle,target 3D scale,bottom center projection,and viewing angle category.Aiming at the problem that the 2D bounding box obtained by single-stage network regression is not accurate enough to affect the 3D positioning accuracy,it is proposed to use a cascaded network framework to optimize the more accurate 2D bounding box from the parameter regression module.After obtaining the3 D attributes of the above five types of targets,aiming at the time-consuming problem of solving the 3D pose of the target,the use of cascaded geometric constraints is proposed to solve the target pose: first,the geometric constraints of similar triangles to obtain the approximate target depth;then constructs the 3D-2D perspective transformation equation set,and uses the obtained rough target depth value as the initial value to accelerate the numerical solution speed,and finally obtain Target pose.(2)3D vehicle detection and 3D vehicle tracking,as the two main tasks in the perception system,are inextricably linked.Aiming at the problem that the matching process of the Deep Sort method is more complicated and the tracking task depends on the output of the detection result in the previous stage,a 3D vehicle tracking method based on target similarity matching is proposed.First,build the basic tracking process based on the Deepsort method,and obtain the basic tracker by setting the hyperparameters in the algorithm such as target death time,minimum matching times,and vehicle confidence threshold;then,update the target in the tracking set in the current frame through the Kalman filter,and use the Hungarian algorithm to match the detection result of the current frame with the tracking result.The adjacent frame displacement prediction branch,and the detection and tracking results of the previous frame are used as the additional input of the network,so that the 3D tracking task and the 3D detection task can be executed in parallel.In addition,since the network outputs the displacement of the current frame target relative to the previous frame,the proposed method can match the tracked target and the detected target only by adopting a greedy search method,without the need for complex data association methods.Through the above method,3D detection and tracking of vehicles in a given image sequence are realized.(3)Finally,this study uses the KITTI data set and the nu Scenes dataset to verify the proposed method.For the 3D detection task,the depth estimation error and timeconsuming analysis on the KITTI data set are carried out.The experiment proves the proposed method gains average precision of 32% for 3D detection task,which surpass some existing methods.The experimental results on KITTI and nu Scenes dataset also shows the proposed method gains average multi-object tracking accuracy of 89.5% and27.8% for 3D monocular tracking task.The time-consuming for single frame on both datasets are less than 100 ms.In summary,the method proposed in this research has better performance and efficiency thus has the potential for autonomous vehicles.
Keywords/Search Tags:Intelligent vehicle, 3D vehicle detection, 3D vehicle tracking, Cascaded network
PDF Full Text Request
Related items