Font Size: a A A

Research On Structured Prediction Based Road Scene Understanding

Posted on:2018-11-12Degree:DoctorType:Dissertation
Country:ChinaCandidate:L XiaoFull Text:PDF
GTID:1362330623950370Subject:Control Science and Engineering
Abstract/Summary:PDF Full Text Request
Road scene understanding is one of the key challenges in self-driving cars.This thesis researches on vehicle-mounted cameras and multi-layer LIDARs based road scene understanding.Considering the importance of contextual information in the sensory data,we investigated how to utilize the contextual modeling ability of structured prediction to improve the performance of road scene understanding.The main work and contributions of this thesis are as follows:(1)We proposed a novel structured random forest based monocular road detection algorithm which is highly efficient and a new road marking detection algorithm.Con-sidering the ambiguity of local pixel appearance and the structure of the road in images,we proposed using the structured random forest to achieve patch-level structured predic-tion.In the training phrase,after mapping the structured labels to a discrete label space,the structured random forest can be trained as the same as the traditional random forest.The proposed structured random forest based road detection algorithm takes the contex-tual information in the patch and the structure information of the road into consideration,therefore reduces the ambiguity of local pixel appearance effectively.Besides,by predict-ing the labels of all pixels of a patch in a single prediction,the proposed method is much faster than the traditional pixel-wise random forest classifier.Experiments tested on the KITTI-ROAD benchmark dataset and the unstructured road data collected with our ex-perimental vehicle show that the structured random forest based road detection algorithm outperforms the traditional random forest classifier both in performance and efficiency.In addition,we applied the structured random forest to road marking detection by designing specific features targeted for the task and adopting the similar training strategy.The struc-tured random forest based road marking detection algorithm overcomes the difficulty of choosing thresholds in the traditional methods.Experimental results on the Caltech lane dataset validated the effectiveness of the proposed method.(2)We proposed two conditional random field(CRF)based camera-LIDAR fusion algorithms and successfully applied them to road detection.Cameras and LIDARs have their own inherited characteristics.Stable road detection should integrate the merits of both sensors.The proposed frameworks achieve fine-grained fusion within the frame-work of CRF,overcoming the drawback of traditional fusion methods which are usually dominated by one of the sensors.The proposed FusedCRF builds the random field on the pixel grid.We add an extra potential term for the pixels which are also observed by LIDAR to extend the basic pairwise CRF.The proposed FusedCRF model integrates the cues observed from image and point cloud and the contextual consistency prior in the im-age in a probabilistic model and infers the best labeling jointly.The proposed HybridCRF model improve the FusedCRF further by building the random field directly on the pixels and LIDAR points.The pairwise terms in the energy function explicitly model(1)the con-textual consistency in the image,(?)the contextual consistency in the point cloud,and(?)the cross-modal consistency between the aligned pixels and LIDAR points.The proposed methods combine the fine-grained sensor fusion and contextual modeling.They achieved good performance on the KITTI-ROAD dataset.Especially,the results of HybridCRF on the UM subset rank first on the leaderboard apart from the deep-learning-based ones.(3)We proposed two novel deep convolutional network based semantic scene under-standing algorithms.The first one combines full convolutional network and region based classification network in an end-to-end deep model.After extracting the high-level con-volutional features,the proposed model proceeds in two parallel channels:one is the fully convolutional channel and the other is the region based classification channel which also outputs a pixel-wise score map by adding a region-to-pixel layer.The pixel-wise score maps of the two channels are fused by summation to get the final score.Experiments tested on the SIFT Flow dataset show the effectiveness of this model.We also proposed the dilated fully convolutional densely connected network(DFCDN)for semantic seg-mentation.DFCDN re-purposes the recently developed densely connected network for semantic segmentation by fully convolutional modification.The proposed DFCDN keeps the advantages of the densely connected network,like,parameter compactness,feature reusing and being easier to train.Besides,we skipped some of the sub-sampling layers to obtain feature maps with higher spatial resolution and introduced dilated convolution to the back-ends of the network to incorporate a wider range of contextual information.Experiments on the Pascal VOC segmentation dataset and the Cam Vid road scene under-standing dataset validated the superiority of the model.
Keywords/Search Tags:Structured Prediction, Road Detection, Structured Random Forest, Sensor Fusion, Conditional Random Field, Deep Learning, Convolutional Neural Network
PDF Full Text Request
Related items