
Research On Full-time Pedestrian Detection Based On RGB And Thermal Fusion

Posted on: 2024-02-25 | Degree: Master | Type: Thesis
Country: China | Candidate: K Fang | Full Text: PDF
GTID: 2568307115963929 | Subject: Computer Science and Technology
Abstract/Summary:
Pedestrian detection has been intensively investigated by the computer vision community because of its diverse applications in smart cities, intelligent transportation, intelligent security, autonomous driving, and other areas. However, most current pedestrian detectors are restricted to benchmarks of color images captured under good visual conditions, and they are likely to fail under poor illumination, in complex surroundings, or in bad weather, which seriously hinders the application of pedestrian detection technology in everyday life. Thermal images contain salient object information and are highly robust to interference, so they can provide rich complementary information for color images. In recent years, more and more researchers have focused on pedestrian detection algorithms based on multispectral feature fusion. However, most existing multispectral pedestrian detection algorithms are based on the two-stage Faster R-CNN or on one-stage detectors with anchor boxes, and they suffer from low inference speed and low detection accuracy. To address this, we choose the one-stage anchor-free YOLOX as the detector and focus on designing multispectral fusion strategies that overcome these shortcomings. The main contributions are as follows:

(1) Multispectral pedestrian detection algorithm based on multi-scale feature fusion and modal complementarity. To make better use of the complementarity of the two modalities and achieve better detection performance, we first propose a novel Multi-scale Feature Enhancement (MFE) module, which enriches the semantic and texture information contained in the color and thermal feature maps. Second, to enrich the cross-modality information contained in the color and thermal feature maps, we apply a fusion method based on the complementarity of the two modalities. The proposed method improves detection accuracy while retaining the inference-speed advantage of the one-stage detector. Compared with the baseline, our approach lowers MR⁻² by 2.85% on the reasonable all-day subset of the KAIST dataset.

(2) Pedestrian detection algorithm based on attention-guided multispectral adaptive fusion. Merging feature maps with equal weights ignores the differences between the two modalities in how they capture light. Illumination-aware weighting strategies cannot handle regional shadows, and their training complexity is high. To address this, we first generate attention vectors from the RGB and thermal feature maps. Second, we normalize the attention vectors across modalities. To achieve adaptive fusion at the feature level, we then use the attention vectors to rescale the RGB and thermal feature maps. Finally, we conduct comprehensive experiments on the KAIST dataset to demonstrate the superiority of three lightweight modules based on spatial attention, channel attention, and coordinate attention. Compared with the baseline, the three modules lower MR⁻² by 1.32%, 1.69%, and 1.17%, respectively, on the reasonable all-day subset.

(3) Multispectral pedestrian detection algorithm based on a cross-modality self-attention mechanism. There is a significant association between aligned RGB and thermal images, which existing algorithms rarely consider. To integrate the cross-modality features and improve detection accuracy, we concatenate the feature maps of the two modalities along the unfolded width and height dimensions. We then use a Transformer to compute cross-modality self-attention and integrate the cross-modality features. Finally, the fused feature maps are used to guide the single-modality feature maps, enabling each stream to obtain the cross-modality information of the other stream. Our approach lowers MR⁻² by 2.14% on the reasonable all-day subset of the KAIST dataset.
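As a rough illustration of the attention-guided adaptive fusion in contribution (2), the three steps (generate attention vectors, normalize them across modalities, rescale and merge the feature maps) can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the thesis implementation: the attention vector here is simply the per-channel global average of each feature map, the cross-modality normalization is assumed to be a two-way softmax, and the function names (global_avg_pool, cross_modality_softmax, adaptive_fuse) are hypothetical. The modules evaluated in the thesis are learned spatial, channel, and coordinate attention blocks whose details are not given in the abstract.

```python
import math

def global_avg_pool(fmap):
    """Reduce a C x H x W feature map (nested lists) to one mean per channel.
    Assumption: this stands in for a learned channel-attention branch."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0])) for ch in fmap]

def cross_modality_softmax(a_rgb, a_thermal):
    """Normalize two attention vectors so that, for every channel,
    the RGB weight and the thermal weight sum to 1 (assumed normalization)."""
    w_rgb, w_thermal = [], []
    for r, t in zip(a_rgb, a_thermal):
        m = max(r, t)  # subtract the max for numerical stability
        e_r, e_t = math.exp(r - m), math.exp(t - m)
        w_rgb.append(e_r / (e_r + e_t))
        w_thermal.append(e_t / (e_r + e_t))
    return w_rgb, w_thermal

def rescale(fmap, weights):
    """Multiply every value in channel c by weights[c]."""
    return [[[w * v for v in row] for row in ch] for ch, w in zip(fmap, weights)]

def adaptive_fuse(f_rgb, f_thermal):
    """Attention-weighted sum of two aligned C x H x W feature maps."""
    w_rgb, w_thermal = cross_modality_softmax(
        global_avg_pool(f_rgb), global_avg_pool(f_thermal)
    )
    r = rescale(f_rgb, w_rgb)
    t = rescale(f_thermal, w_thermal)
    fused = [
        [[x + y for x, y in zip(r_row, t_row)] for r_row, t_row in zip(r_ch, t_ch)]
        for r_ch, t_ch in zip(r, t)
    ]
    return fused, w_rgb, w_thermal

# Toy input: channel 0 is equally active in both modalities,
# channel 1 is active only in the thermal stream.
f_rgb = [[[1.0, 1.0], [1.0, 1.0]], [[0.0, 0.0], [0.0, 0.0]]]
f_thermal = [[[1.0, 1.0], [1.0, 1.0]], [[2.0, 2.0], [2.0, 2.0]]]
fused, w_rgb, w_thermal = adaptive_fuse(f_rgb, f_thermal)
```

In this toy case the two streams receive equal weight on channel 0, while the thermal stream dominates channel 1. Unlike equal-weight merging, the weights vary per channel with the content of the feature maps, and no separate illumination-estimation network is needed.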
Keywords/Search Tags:Multispectral, Multi-scale, Pedestrian detection, Feature fusion, Self-attention