Semantic Segmentation Enriched Features Based Pedestrian Detection

Posted on: 2020-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: X L Xie
Full Text: PDF
GTID: 2428330572974422
Subject: Pattern Recognition and Intelligent Systems
Abstract/Summary:
Pedestrian detection, as a branch of computer vision, has many significant real-world applications, such as autonomous driving and human behavior analysis. With the rapid development of deep learning and computing power, many object detection and pedestrian detection frameworks based on convolutional neural networks have been proposed and have achieved significant breakthroughs. The anchor mechanism commonly used in deep learning based object detection frameworks has difficulty learning valid features of the areas surrounding pedestrians. We aim to design a better feature learning architecture by introducing semantic segmentation features.

In this thesis, we propose a feature fusion mechanism called the feature enrichment unit. The feature enrichment units receive feature maps from the body network layer by layer, pass features in a backward manner, and produce multi-scale fused feature maps. Several multi-scale feature enrichment units refine features together through two main streams, the P stream and the S stream. The P stream produces multi-scale feature maps with high-level semantic features. The S stream produces multi-scale semantic segmentation maps of the original image. The feature maps of the P stream and the S stream are concatenated to form the multi-scale fused feature maps used for pedestrian detection. Because it receives and outputs feature maps, the feature enrichment unit is easy to embed into existing detection frameworks based on convolutional neural networks. We design the anchors using priors produced by an IoU-based k-means clustering algorithm. We use an alternating training strategy to train the network for detection and for semantic segmentation in turn.

We evaluated the proposed framework on the KITTI dataset and achieved considerable detection performance. The ablation experiments indicate that adding the semantic segmentation branch alone does not improve detection performance, whereas using its outputs as extra features does. The gain in detection performance comes from fusing the semantic segmentation features (S stream) with the features of the P stream to form the enriched features for detection.

We also propose a human tracking model that consists of three modules: the pedestrian detection module, the person re-identification module, and the identity manager. We design a spatial bin pooling structure that preserves spatial features while remaining robust to feature shift. We also design an identity mask mechanism to improve matching accuracy.
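
The abstract states that anchor priors come from an IoU-based k-means clustering algorithm but gives no further detail. The sketch below illustrates one common form of that idea, under assumptions that are not from the thesis: boxes are reduced to (width, height) pairs aligned at a shared corner, the distance is 1 - IoU, centroids are updated with the per-cluster mean, and the names `wh_iou` and `kmeans_anchors` as well as the sample box sizes are hypothetical.

```python
import random


def wh_iou(box, centroid):
    """IoU of two (width, height) boxes aligned at a shared corner."""
    inter = min(box[0], centroid[0]) * min(box[1], centroid[1])
    union = box[0] * box[1] + centroid[0] * centroid[1] - inter
    return inter / union


def kmeans_anchors(boxes, k, iters=100, seed=0):
    """Cluster (width, height) pairs using the distance d = 1 - IoU."""
    rng = random.Random(seed)
    centroids = rng.sample(boxes, k)
    for _ in range(iters):
        # Assignment step: highest IoU is the same as smallest 1 - IoU.
        clusters = [[] for _ in range(k)]
        for box in boxes:
            best = max(range(k), key=lambda i: wh_iou(box, centroids[i]))
            clusters[best].append(box)
        # Update step: mean width and height; an empty cluster keeps its centroid.
        updated = []
        for members, old in zip(clusters, centroids):
            if members:
                w = sum(b[0] for b in members) / len(members)
                h = sum(b[1] for b in members) / len(members)
                updated.append((w, h))
            else:
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    # Return anchors ordered from smallest to largest area.
    return sorted(centroids, key=lambda c: c[0] * c[1])


# Hypothetical box sizes in pixels (width, height); these are not KITTI data.
boxes = [(20, 50), (22, 55), (18, 48), (60, 150), (64, 160), (58, 145)]
anchors = kmeans_anchors(boxes, k=2)
print(anchors)
```

On this toy input the two returned anchors settle on the small and the large group of boxes. Using IoU instead of Euclidean distance keeps large boxes from dominating the clustering, which matters when pedestrian sizes vary widely across an image.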
Keywords/Search Tags:Pedestrian detection, Semantic segmentation, Deep learning