
Research On Cross-Modality Person Re-Identification Based On Deep Feature Representation Learning

Posted on: 2024-04-06
Degree: Master
Type: Thesis
Country: China
Candidate: L Fan
Full Text: PDF
GTID: 2568307055478124
Subject: Electronic Information (Field: Computer Technology) (Professional Degree)
Abstract/Summary:
Person re-identification (Re-ID) is the task of retrieving and matching pedestrians across non-overlapping cameras. Visible-infrared cross-modality person re-identification (VI-ReID) exploits information from both visible and infrared images to perform the pedestrian retrieval task. Because VI-ReID enables round-the-clock surveillance of an environment, it is of great importance for practical applications such as intelligent surveillance and crime investigation. However, VI-ReID is susceptible to intra-modality variation such as changes in camera viewpoint, differences in person pose, and background noise. Deep learning-based feature extraction methods can extract more robust deep features and alleviate these problems to some extent, but the large intra-modality and cross-modality variation still makes it difficult to learn discriminative part features and cross-modality sharable features. In addition, modality channel variation and heavy background noise mean that images automatically cropped by the pedestrian detector are often poorly aligned, which interferes with pedestrian representation learning. To address these issues, the key work and innovations of this thesis are as follows.

(1) A deep network model based on multi-scale attention component aggregation is proposed to address the difficulty, caused by large differences in color features in VI-ReID, of extracting both intra-modality discriminative features and inter-modality sharable features. First, an intra-modality multi-scale attention module is designed for each single modality; it fully mines fine-grained cues within the modality and thereby reduces the interference of background noise. Building on this, a fine-grained component aggregation mechanism based on scale partitioning and joint channel-spatial soft attention is designed, which fuses multi-scale fine-grained features from local to global in a cascading manner to better extract features shared across modality images. Finally, the method
was evaluated on the SYSU-MM01 and RegDB datasets; its mAP reached 71.61% and 74.33%, and its Rank-1 accuracy reached 64.40% and 82.91%, respectively.

(2) A deep network model based on graph-correlated attention feature alignment is designed for VI-ReID, in which differing camera viewpoints and large changes in pedestrian pose cause feature misalignment and weak robustness to noisy samples. First, the proposed graph-correlated attention module attends to both coarse-grained and fine-grained features to construct graph relations among pedestrian features, which enhances the cross-modality contextual relationship representation and improves robustness to noisy samples. An inter-modality feature alignment module is then designed for pixel-level association of cross-modality local features; it exploits the dense correspondence between cross-modality person images to compute feature similarity in a probabilistic manner and to reduce the distance between same-class samples across modalities. Finally, the models were evaluated on the SYSU-MM01 and RegDB datasets, reaching mAP of 80.91% and 88.92% and Rank-1 of 74.28% and 90.90%, respectively.

(3) To verify the effectiveness of the proposed methods, a VI-ReID system was deployed and operated in a campus setting. The feature extraction network and feature alignment method proposed in this thesis were used to extract and refine features, respectively, to obtain pedestrian retrieval results. A person retrieval test of the system showed accurate detection results, demonstrating that the system is practical and ready for real-world deployment.
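The joint channel-spatial soft attention described in contribution (1) can be illustrated with a minimal sketch. This is not the thesis's actual module (whose architecture and parameters are not given here); it is an assumed, simplified NumPy version in which a feature map is first reweighted per channel by a softmax over global channel descriptors, then reweighted per spatial position by a softmax over the channel-averaged map.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def channel_spatial_attention(feat):
    """Illustrative joint channel-spatial soft attention on a C x H x W map.

    Channel branch: weight each channel by a softmax over its global
    average activation. Spatial branch: weight each position by a softmax
    over the channel-mean map. Weights are rescaled so they average ~1,
    keeping the output on the same magnitude as the input.
    """
    C, H, W = feat.shape
    channel_desc = feat.mean(axis=(1, 2))             # (C,) global descriptors
    channel_w = softmax(channel_desc)                 # sums to 1 over channels
    feat_c = feat * channel_w[:, None, None] * C      # channel-attended map
    spatial_desc = feat_c.mean(axis=0).reshape(-1)    # (H*W,) per-position scores
    spatial_w = softmax(spatial_desc).reshape(H, W)   # sums to 1 over positions
    return feat_c * spatial_w[None] * (H * W)         # spatially attended map

feat = np.random.rand(8, 4, 4)
out = channel_spatial_attention(feat)
assert out.shape == feat.shape
```

A trained module would learn these weightings (e.g. via small conv/FC layers) rather than derive them directly from activation statistics; the sketch only shows the soft-attention reweighting pattern itself.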
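The probabilistic cross-modality similarity in contribution (2) can likewise be sketched. The function below is a hypothetical simplification, not the thesis's alignment module: each local part feature from one modality is softly matched to the parts of the other modality via a softmax over cosine similarities, and the expected similarity under that soft correspondence is averaged over parts. The `temperature` parameter is an assumed knob controlling how sharp the soft matching is.

```python
import numpy as np

def probabilistic_alignment_similarity(parts_a, parts_b, temperature=0.1):
    """Expected similarity between two sets of local part features.

    parts_a: (Na, d) local features from one modality.
    parts_b: (Nb, d) local features from the other modality.
    Each row of parts_a is softly matched to rows of parts_b
    (softmax over cosine similarities), and the similarity expected
    under that matching distribution is averaged over parts.
    """
    a = parts_a / np.linalg.norm(parts_a, axis=1, keepdims=True)
    b = parts_b / np.linalg.norm(parts_b, axis=1, keepdims=True)
    sim = a @ b.T                                    # (Na, Nb) cosine similarities
    logits = sim / temperature
    match = np.exp(logits - logits.max(axis=1, keepdims=True))
    match /= match.sum(axis=1, keepdims=True)        # soft correspondence per part
    return float((match * sim).sum(axis=1).mean())   # expected matched similarity
```

For identical single-part inputs the soft match is exact and the similarity is 1.0; for dissimilar sets the value falls toward the average cosine similarity, so it can serve as a cross-modality matching score that is robust to part misalignment.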
Keywords/Search Tags: cross-modality person re-identification, sharable feature, feature representation learning, attention mechanism