
Research On Person Re-Identification Based On Cross-Modal Data

Posted on: 2024-05-16
Degree: Master
Type: Thesis
Country: China
Candidate: Y S Li
Full Text: PDF
GTID: 2568307127453874
Subject: Control Science and Engineering
Abstract/Summary:
Person re-identification is the task of matching images of pedestrians with the same identity across one or more cameras. However, because of changes in illumination and surveillance background, visible-light cameras alone cannot capture clear, reliable pedestrian images, so large numbers of pedestrians at night cannot be identified effectively and accurately; infrared cameras have therefore become increasingly important for night-time surveillance. Cross-modal person re-identification, which matches pedestrians with the same identity across cameras of different modalities, emerged in response and has become a key research area. Because images captured by cross-modal cameras differ greatly and have entirely different styles, accurate classification and matching are difficult even for a human observer. Bridging this large cross-modal gap is the central and most difficult problem in current cross-modal person re-identification research.

This thesis studies cross-modal person re-identification methods, focusing on the classification and matching of person images with completely different styles in infrared and visible-light scenes. Deep learning techniques, including attention mechanisms, collaborative learning, and intermediate-domain modules, are used to address the large color discrepancy, pose variation, and small datasets of cross-modal pedestrian images, thereby improving the accuracy of existing cross-modal person re-identification models. The main contributions of this thesis are as follows:

(1) An infrared-visible cross-modal person re-identification method via dual-attention collaborative learning is proposed, which combines channel-attention and spatial-attention deep features and supplies complementary information to multiple classifiers through cross-modal distribution alignment constraints. For feature learning, a channel attention mechanism and a local spatial pooling method are used for feature extraction; at the decision level of the network, a multi-classifier cross-modal distribution alignment strategy is proposed, which better exploits the complementary local and global information between modality-shared and modality-specific classifiers. Experimental results show that the proposed method is feasible and significantly outperforms the baseline, with Rank-1 accuracy of 57.33% and mAP of 54.49% on the SYSU-MM01 dataset, and Rank-1 of 85.33% and mAP of 82.10% on the smaller RegDB dataset.

(2) A cross-modal person re-identification method based on instance-normalization style fusion and a tri-triplet loss is proposed. It uses instance normalization to extract the style information of the shallow network, applies a squeeze-and-excitation channel attention mechanism to locate the focus of attention, and fuses the domain-invariant style information of the two modalities to form the input of a third branch; the three branches are then fed into a parameter-shared three-branch network for common feature extraction. As a feature-space constraint, a new tri-modal triplet loss is proposed, which strengthens inter-class constraints by restricting the modalities of the anchor, positive, and negative samples. Extensive experiments show that enhancing feature style information and strengthening modality constraints effectively improves accuracy: Rank-1 reaches 58.89% and mAP 56.27% on SYSU-MM01, while Rank-1 reaches 87.87% and mAP 85.04% on RegDB.

(3) An efficient cross-modal person re-identification method based on an intermediate generation module and joint constraints is proposed. The intermediate-domain module extracts modality-specific information in the shallow network and generates intermediate modalities, reducing the difficulty of modality conversion while avoiding the introduction of noise and ensuring efficient extraction of useful information. In addition, an adaptive joint constraint is proposed: a diversity loss and a bridging loss constrain the generation of intermediate-domain modalities, while a center triplet loss and a maximum mean discrepancy loss reduce the gap between modalities at the decision level. Rank-1/mAP reach 62.83% and 58.79% on the SYSU-MM01 dataset, and 91.30% and 86.74% on the RegDB dataset, respectively.
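The channel attention used in the first two methods follows the squeeze-and-excitation pattern: per-channel statistics are pooled over the spatial dimensions, passed through a small bottleneck network, and turned into a sigmoid gate that reweights each channel. The abstract does not give the exact architecture, so the following is a minimal NumPy sketch; the weight shapes, the ReLU bottleneck, and the reduction ratio are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(features, w1, w2):
    """Squeeze-and-excitation style channel attention (sketch).

    features: (C, H, W) feature map
    w1: (C // r, C) bottleneck weight, w2: (C, C // r) expansion weight
    """
    # Squeeze: global average pooling over the spatial dimensions
    squeezed = features.mean(axis=(1, 2))                 # (C,)
    # Excitation: bottleneck MLP with ReLU, then a sigmoid gate
    gate = sigmoid(w2 @ np.maximum(w1 @ squeezed, 0.0))   # (C,) in (0, 1)
    # Reweight each channel of the feature map
    return features * gate[:, None, None]
```

Because the gate lies strictly in (0, 1), informative channels are preserved while uninformative ones are attenuated, which is the "focus of attention" role the abstract assigns to this module.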
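Instance normalization, used in the second method to extract style information, removes each channel's spatial mean and standard deviation; those removed statistics are precisely the "style" that can be blended across the two modalities to build the third branch's input. The abstract does not specify the fusion operator, so the sketch below is one plausible reading: `fuse_styles` and its blend weight `alpha` are assumptions, not the thesis's actual formulation.

```python
import numpy as np

def instance_norm(x, eps=1e-5):
    """Instance normalization over spatial dims for one (C, H, W) map.

    Returns the normalized map plus the per-channel style statistics
    (mean, std) that were removed from it.
    """
    mean = x.mean(axis=(1, 2), keepdims=True)
    std = x.std(axis=(1, 2), keepdims=True) + eps
    return (x - mean) / std, mean, std

def fuse_styles(content, mean_a, std_a, mean_b, std_b, alpha=0.5):
    """Hypothetical style fusion: re-apply a blend of two modalities'
    channel statistics to an instance-normalized content map, forming
    an intermediate-style input for a third network branch."""
    mean = alpha * mean_a + (1 - alpha) * mean_b
    std = alpha * std_a + (1 - alpha) * std_b
    return content * std + mean
```

With `alpha = 1` this reproduces modality A's style exactly; intermediate values yield a map whose style lies between the infrared and visible domains.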
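The tri-modal triplet loss of the second method strengthens inter-class constraints by restricting which modalities the anchor, positive, and negative samples may come from. The abstract does not give the exact sampling rule, so the sketch below is one plausible instance, not the thesis's definition: positives are same-identity samples from the other modality and negatives are different-identity samples from the anchor's own modality, which pulls cross-modal intra-class pairs together while pushing same-modal inter-class pairs apart.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Standard margin-based triplet loss on Euclidean distances."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negative)
    return max(d_ap - d_an + margin, 0.0)

def tri_triplet_loss(feats, ids, modalities, margin=0.3):
    """Hypothetical modality-constrained triplet loss (sketch).

    feats: (N, D) feature matrix; ids, modalities: length-N labels.
    Positive: same identity, different modality than the anchor.
    Negative: different identity, same modality as the anchor.
    """
    losses = []
    n = len(feats)
    for i in range(n):
        pos = [j for j in range(n)
               if ids[j] == ids[i] and modalities[j] != modalities[i]]
        neg = [j for j in range(n)
               if ids[j] != ids[i] and modalities[j] == modalities[i]]
        for p in pos:
            for q in neg:
                losses.append(triplet_loss(feats[i], feats[p], feats[q], margin))
    return float(np.mean(losses)) if losses else 0.0
```

In practice such a loss is computed per mini-batch with hard-example mining rather than over all pairs; the exhaustive loops here are only for clarity.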
Keywords/Search Tags:deep learning, person re-identification, infrared-visible, cross-modal retrieval