| Person re-identification is the task of recognizing the same person across multiple cameras. Because of its importance in video surveillance and public safety, it has been studied extensively for decades. Mature recognition networks are mainly trained with supervised methods, which rely on complete, high-quality labels; however, labeling is costly and time-consuming. In recent years, unsupervised domain adaptive person re-identification has attracted growing attention, and many proposed methods are gradually closing the gap in recognition accuracy with supervised networks. Among them, clustering-based unsupervised domain adaptive networks make real-world deployment of person re-identification feasible because of their smaller models and low dependence on labeled datasets. However, their performance depends heavily on the quality of the pseudo-labels generated by clustering. To address the problem of clustering noise, this article focuses on reducing the generation of clustering noise and optimizing the network's feature learning, and makes the following three contributions. First, to address the noise caused by inconsistent person proportions in the dataset's images, which prevents features from being matched, this article proposes a semantic fusion layer structure. The spatial semantic fusion module in this structure adaptively adjusts the size of the receptive field, alleviating the problem that a fixed receptive field cannot match features of different sizes. In addition, the channel semantic fusion module in the layer structure preserves fine-grained high-level features by fusing similar features from different channels, thereby improving the representation of person features. Second, to address the domain gap that arises during domain adaptation, this article proposes to fully exploit the self-similarity in the 
target domain. The network's output feature maps are divided horizontally to obtain one global feature and two local features, and clustering is performed separately on each of them to enrich the pseudo-labels of each person. Assigning three kinds of pseudo-labels to the same person enhances the network's ability to identify that person. Finally, to address the impact of irrelevant local information in images on pseudo-labels, this article proposes a correlation score and two optimization schemes for pseudo-labels. Pedestrian images are divided into blocks and passed through the network for feature extraction. A correlation score is designed by exploring the correlation between global and local features. Based on this score, local feature vectors that are irrelevant to the pedestrian are converted to average vectors to prevent noisy features produced by overfitting. In addition, for global pseudo-labels, fine-grained pedestrian features with high correlation scores are used to refine the weights of the global pseudo-labels, thus retaining discriminative global and local features and improving the network's ability to learn features. The network was evaluated in multiple domain adaptation experiments, with thorough comparative analysis, on three mainstream datasets: DukeMTMC-reID, Market-1501, and MSMT17. The experimental results, evaluated using the mAP and CMC metrics, demonstrate the effectiveness of the proposed methods. |
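
The horizontal partition and the global-local correlation described above can be illustrated with a minimal sketch. It is not the article's implementation: the function names, the use of average pooling, and the choice of cosine similarity as the correlation score are all assumptions made for illustration, since the abstract does not define them. In a full pipeline, each of the three vectors would be clustered separately over the target-domain images to produce three sets of pseudo-labels; that step is omitted here.

```python
import math


def average_pool(feature_map, row_start, row_end):
    """Average each channel over rows [row_start, row_end) and all columns."""
    pooled = []
    for channel in feature_map:
        values = [v for row in channel[row_start:row_end] for v in row]
        pooled.append(sum(values) / len(values))
    return pooled


def split_global_local(feature_map):
    """Return (global, upper, lower) vectors from a C x H x W nested list.

    Hypothetical helper: the global vector pools the whole map, while the
    two local vectors pool the upper and lower halves of the map.
    """
    height = len(feature_map[0])
    mid = height // 2
    global_vec = average_pool(feature_map, 0, height)
    upper_vec = average_pool(feature_map, 0, mid)
    lower_vec = average_pool(feature_map, mid, height)
    return global_vec, upper_vec, lower_vec


def cosine_similarity(a, b):
    """Assumed stand-in for the correlation score between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


# Toy feature map with 2 channels, 4 rows, and 2 columns.
toy_map = [
    [[1.0, 1.0], [1.0, 1.0], [3.0, 3.0], [3.0, 3.0]],
    [[0.0, 2.0], [2.0, 0.0], [4.0, 4.0], [4.0, 4.0]],
]
global_vec, upper_vec, lower_vec = split_global_local(toy_map)
scores = [cosine_similarity(global_vec, v) for v in (upper_vec, lower_vec)]
print(global_vec, upper_vec, lower_vec, scores)
```

Under these assumptions, a local vector whose score falls below some chosen threshold would be treated as irrelevant to the pedestrian, while high-scoring local vectors would be the ones used to refine the global pseudo-labels.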