Studies On Vehicle Detection And Its State Estimation Based On Deep Learning Method And Synthetic Data

Posted on: 2020-01-28  Degree: Doctor  Type: Dissertation
Country: China  Candidate: Y Wang  Full Text: PDF
GTID: 1362330602455718  Subject: Vehicle Engineering
Abstract/Summary:
Intelligent vehicles are one of the most important directions for the future development of automotive technology, and the environmental sensing module is an important part of them. Sensors such as cameras, radar, and Lidar are widely used in these systems. Because of its low cost and rich information, the camera has always been one of the most commonly used sensors for intelligent vehicles. However, object detection and state estimation based on image processing still face great challenges. The detection and state estimation of surrounding vehicles, which are important traffic participants, has always been a research focus of intelligent vehicle systems.

In recent years, thanks to faster computing and the emergence of many annotated data sets, data-driven deep neural networks have achieved great success in object detection. These methods can also be used for vehicle detection. They take an image as input and output each object as a 2D bounding box in the image coordinate system; this type of method is called 2D bounding box detection. However, these methods are designed for general-purpose object detection and are not optimized for vehicles specifically, so applying them directly to vehicle 2D bounding box detection is imprecise and inefficient.

A 2D bounding box in the image coordinate system is also not enough for an intelligent vehicle system, which needs the position, pose, and dimensions of surrounding vehicles for accurate decision-making and planning. After the vehicle 2D bounding boxes are detected, it is therefore necessary to estimate the state of the corresponding 3D bounding boxes in the camera coordinate system. Most methods fuse signals from other sensors, such as Lidar, with the image signal to estimate the 3D bounding box state. Since Lidar is expensive, this greatly limits the application of these methods. Compared with 2D bounding box detection, estimating the vehicle 3D bounding box state with a monocular camera is more challenging. Satisfactory accuracy is hard to achieve, and little research on the topic can be found in the literature.

Deep learning-based vehicle detection and state estimation methods need a large amount of annotated data for model training. More complex neural networks have more parameters and need more training data. Collecting and manually labeling real image data is very tedious, and the labels can be inaccurate and inconsistent because of human error. Synthetic data sets can be generated with 3D modeling and rendering techniques from computer graphics. Compared with real data sets, synthetic data sets are easy to generate and annotate, and with the development of computer software and hardware they can partially replace real data sets in model training. However, generating high-fidelity and richly varied synthetic data sets still needs further study, and the influence of each imaging factor on the final detection result needs further exploration.

Compared with labeling 2D bounding boxes on real data, labeling 3D bounding boxes is a much more complex task, so relatively few real data sets have 3D bounding box annotations. Labeling 3D bounding boxes often requires the help of other sensors such as Lidar, and even then it is difficult to obtain accurate and consistent labels. Labeling 3D bounding boxes is much easier in a synthetic data set. Training a 3D bounding box state estimator with synthetic data is therefore particularly valuable, but related research is rare. Because of the domain gap between synthetic and real data, a model trained on synthetic data can hardly achieve satisfactory quality when tested on real data.

To address these problems, this dissertation focuses on vehicle 2D bounding box detection, 3D bounding box state estimation, and the generation and application of synthetic data sets. The main research contents are as follows:

1. Starting from the original Faster R-CNN object detector, this dissertation optimizes it specifically for vehicle 2D bounding box detection. In the region proposal stage, the method first generates multi-shape receptive fields through a special network design, making the shape of the receptive field better suited to vehicle 2D bounding box detection. Based on the shape of the receptive field and the perspective effect of imaging, a prior knowledge-based method is used to optimize the anchor generation procedure. With this method, the anchors cover the ground truth 2D bounding boxes more accurately, and the number of invalid anchors is reduced. Finally, in the classification and regression stage, proposal regions are assigned to feature maps according to their size and the feature stride of each feature map, so that the amount of information after ROI pooling is more suitable for the final prediction. Compared with the original method without optimization, the proposed method achieves significant improvements in both detection accuracy and computing speed.

2. Building on vehicle 2D bounding box detection, this dissertation presents a new vehicle 3D bounding box state estimation method. An existing monocular, pixel-level depth estimation method is used to generate a depth map, from which pseudo point cloud data is generated through geometric computation. With this pseudo point cloud, point cloud-based methods can be used for vehicle 3D bounding box state estimation. Furthermore, the normal vector signal is estimated from the pseudo point cloud and used for 3D bounding box state estimation. Experiments show that the normal vector signal benefits state estimation performance. Finally, this dissertation presents a self-attention module that fuses pseudo point position, normal vector, and RGB information. This fusion method further improves vehicle 3D bounding box state estimation performance.

3. This dissertation proposes a method for generating and labeling synthetic data sets based on domain randomization. The synthetic data set is used to replace the real training data set. With physically based rendering, synthetic images can be generated with high fidelity. Domain randomization is employed to ensure rich variation in the synthetic data set, which helps prevent the model from overfitting. Both photo-realistic and non-photo-realistic synthetic images are generated in order to make use of their different advantages. In the experiments, the proposed synthetic data set is compared with other synthetic data sets, using the same vehicle 2D bounding box detector as the test benchmark. The influence of randomizing each imaging factor on the final detection results is also discussed in detail, and the performance of pre-training with synthetic data and then fine-tuning with a small amount of real data is demonstrated.

4. Using the proposed vehicle 3D bounding box state estimation method and the synthetic data set, this dissertation studies the application of synthetic data to vehicle 3D bounding box estimation. Because of the domain gap, networks trained on synthetic data can hardly perform well when tested on real data. This problem is evident in 2D bounding box detection and is more serious in 3D bounding box state estimation. This dissertation addresses the problem from two aspects. First, the original synthetic data is augmented to introduce non-ideal factors, so that the trained model is more robust to interference. Second, source domain and target domain features are aligned through an unsupervised adversarial domain adaptation method. With these methods, the performance of models trained on synthetic data is improved. The training procedure does not need 3D bounding box labels for the real data set, which eliminates the tedious annotation work.
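
The step in item 2 that turns a depth map into a pseudo point cloud "through geometric computing" can be illustrated with a minimal sketch. It assumes a standard pinhole camera model with intrinsics fx, fy, cx, cy; the abstract does not give the exact formulation, so the function name, the parameter names, and the rule for skipping invalid depths are illustrative assumptions, not the author's implementation.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def depth_to_pseudo_point_cloud(
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project a depth map into 3D points in the camera frame.

    Hypothetical helper, assuming a pinhole camera:
      depth[v][u] is the estimated depth (z) at pixel column u, row v;
      fx, fy are focal lengths in pixels; cx, cy is the principal point.
    """
    points: List[Point] = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0.0:
                # Assumption: non-positive depth marks an invalid pixel.
                continue
            x = (u - cx) * z / fx  # invert u = fx * x / z + cx
            y = (v - cy) * z / fy  # invert v = fy * y / z + cy
            points.append((x, y, z))
    return points


if __name__ == "__main__":
    # Tiny 2x2 depth map with one invalid pixel, for illustration only.
    toy_depth = [[2.0, 0.0], [4.0, 2.0]]
    print(depth_to_pseudo_point_cloud(toy_depth, fx=2.0, fy=2.0, cx=0.0, cy=0.0))
```

In practice the points would be restricted to pixels inside each detected 2D bounding box before being passed to a point cloud-based estimator; that cropping step is omitted here.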
Keywords/Search Tags: Monocular vision, vehicle 2D bounding box detection, 3D bounding box state estimation, synthetic data set, domain adaptation