Person re-identification (Re-ID) is an image retrieval task that aims to identify the same person across non-overlapping surveillance cameras, and it has been widely applied in intelligent surveillance and security systems. In the closed-world setting, existing person Re-ID algorithms perform well when retrieving RGB person images within a single modality. In the open-world setting, however, an algorithm must retrieve the same identity across RGB and infrared images, which is considerably more difficult. Specifically, the RGB-Infrared cross-modality person Re-ID task faces two challenges. On the one hand, color and other visual information cannot be aligned across modalities, resulting in a significant modality discrepancy. On the other hand, negative disturbances in real open environments, such as cluttered backgrounds, overlaps, occlusions, and posture deformation, cause drastic changes in a person's appearance. To address these challenges, models must be built that effectively eliminate the modality discrepancy and extract more discriminative and stable semantic representations. Accordingly, this dissertation proposes the following two algorithms to improve the performance of cross-modality person Re-ID models.

(1) An RGB-Infrared cross-modality person re-identification algorithm based on multi-feature space joint optimization. To bridge the large gap between the RGB and infrared modalities, a novel multi-feature space joint optimization (MSO) network is proposed to effectively learn modality-sharable features in both the single-modality feature spaces and the common feature space. First, based on the observation that edge information is modality-invariant, an edge features enhancement module is proposed to explicitly optimize each single-modality feature space. Specifically, a novel perceptual edge features (PEF) loss is designed to constrain the shallow network to autonomously preserve more edge information and thereby enhance the modality-sharable features. Moreover, to increase
the difference between the cross-modality distance and the class distance, a novel cross-modality contrastive-center (CMCC) loss is introduced into the modality-joint constraints in the common feature space. The PEF loss and the CMCC loss jointly optimize the network in an end-to-end manner, and extensive experiments demonstrate that the proposed algorithm markedly improves the baseline network's performance on both the SYSU-MM01 and RegDB datasets.

(2) An RGB-Infrared cross-modality person re-identification algorithm based on stable semantic extraction. Building on the elimination of the modality discrepancy above, and to address the disturbances caused by drastic changes in a person's appearance, this dissertation further presents a cross-modality person Re-ID algorithm based on stable semantic extraction (SSE), which establishes a more stable matching relationship between cross-modality images. First, a weak semantic fine-grained enhancement (WsE) module, inspired by research on human eye movement during image recognition, is proposed to guide the algorithm to learn fine-grained person representations. Meanwhile, counterfactual reasoning is introduced to optimize the semantic quality; visualization results show that the WsE module can mine more details. In addition, a modality-semantic elimination (MsE) module is proposed to improve the algorithm's ability to extract modality-invariant representations. Specifically, based on a mutual mean-teaching framework, separate classifiers are constructed for each modality, and the modality gap is eliminated by minimizing the distribution discrepancy between their classification results. Extensive experiments demonstrate that the proposed algorithm significantly outperforms most existing methods.
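The abstract does not give the exact formulation of the CMCC loss. As an illustrative sketch only, assuming the loss contrasts per-class feature centers across the two modalities (pulling the RGB and infrared centers of the same identity together while pushing centers of different identities apart by a margin), the idea could be written as follows; the function name `cmcc_loss`, the squared-distance pull term, and the hinge-style push term with `margin` are assumptions for illustration, not the dissertation's actual definition:

```python
import numpy as np

def cmcc_loss(rgb_feats, ir_feats, labels, margin=0.5):
    """Hypothetical sketch of a cross-modality contrastive-center style loss.

    Pulls each identity's RGB and infrared feature centers together, and
    pushes centers of different identities apart by at least `margin`.
    """
    classes = np.unique(labels)
    # Per-class feature centers in each modality.
    rgb_centers = np.stack([rgb_feats[labels == c].mean(axis=0) for c in classes])
    ir_centers = np.stack([ir_feats[labels == c].mean(axis=0) for c in classes])

    # Intra-class term: the same identity's centers should coincide across modalities.
    pull = np.mean(np.sum((rgb_centers - ir_centers) ** 2, axis=1))

    # Inter-class term: centers of different identities should stay `margin` apart.
    push, n_pairs = 0.0, 0
    for i in range(len(classes)):
        for j in range(len(classes)):
            if i != j:
                d = np.linalg.norm(rgb_centers[i] - ir_centers[j])
                push += max(0.0, margin - d)
                n_pairs += 1
    push /= max(n_pairs, 1)
    return pull + push
```

When the two modalities produce identical per-class centers that are already well separated, both terms vanish, so the loss directly measures how far the common feature space is from being modality-aligned at the class level.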