Research On Unsupervised Pedestrian Re-Identification Algorithm And Its Application On Cross-Modality

Posted on: 2024-07-17
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Lu
GTID: 2568307136992299
Subject: Electronic information
Abstract/Summary:
Pedestrian re-identification is the task of locating a target pedestrian within image sets or video sequences captured by multiple non-overlapping surveillance cameras. At present, most research is based on supervised algorithms, whose training data includes labels. However, because annotating data is costly, this approach, although feasible in theory, is difficult to extend to industrial applications. Consequently, the frontier of pedestrian re-identification research has shifted toward unsupervised algorithms that exploit the inherent characteristics of the data. Unlike supervised algorithms, which rely on auxiliary information, unsupervised algorithms must identify and select reliable training data themselves. Data redundancy and the modality gap between infrared and visible images make this task particularly demanding. Building on the rapid progress of deep learning, this study uses convolutional neural networks to address these image-processing difficulties and proposes two distinct solutions that improve the performance of unsupervised pedestrian re-identification. The main research contents are as follows:

(1) Traditional unsupervised models use a clustering algorithm to partition the raw data and then assign an independent pseudo-label to each cluster to simulate supervised learning. Basic, coarse clustering algorithms inevitably introduce noise into the clusters. The first solution starts from cluster optimization and proposes a bi-directional optimization network based on the complementarity of global and local information. First, the network contains a feature segmentation layer, which evenly divides the original feature map into multiple local features along the channel direction. After clustering, the features carry differential information. Second, the network uses the spatial distribution distances of global and local features as optimization factors to refine the pseudo-labels. It selects the local features with the most differential information to optimize the output of the global feature classifier, and the refined global pseudo-labels then smooth the pseudo-labels of their local features in the reverse direction. After a label-consistency check, some redundant data is discarded, and the samples that take part in training are more robust.

(2) Because infrared and visible-light images differ structurally, there are significant gaps between their sub-clusters in the feature space, so general clustering algorithms cannot merge infrared data with visible-light data of the same identity at the initialization stage of the network. To solve this data fusion problem, the second solution introduces joint supplementary samples to help the network perform cross-modal learning. The first step is instance-level transformation: using the style transfer function of CycleGAN, infrared data with added color information serves as transition images. Second, the algorithm designs specific filters to extract contour information and integrates it with the original features, yielding intermediate features that are easier to learn. Lastly, from the perspective of the embedding space, a spatial sample augmentation method is introduced. The degree to which new samples learn from adjacent sub-clusters can be controlled through manual settings, and unlike instance samples, which require feature extraction, spatial samples are parametric and can be used directly as actual samples for training.

The two solutions explore different ways of improving model performance. The first constructs the information difference between part features and global features to refine pseudo-labels, within an overall framework that is simple and efficient. Its shortcoming is that it cannot handle the clustering of cross-modal data. The second improves the original images of the dataset and handles modal features from both the instance and the spatial perspectives. Experimental results demonstrate the feasibility of joint supplementary samples, but the drawback is that the overall network is computationally heavy, resulting in high training costs.
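
The abstract gives no formulas, so the following is only a minimal sketch of three ideas it names: the even channel-wise split into local features, a label-consistency check between global and local clusterings, and a parametric spatial sample. Every function name and rule here is a hypothetical illustration, not the thesis's method. The sketch assumes that consistency is measured as the Jaccard overlap of cluster memberships, that samples are kept above a hand-picked threshold, and that a spatial sample is a linear interpolation between two sub-cluster centers with a manually set weight.

```python
# Illustrative sketch only: the rules below are assumptions, not the thesis's actual method.

def split_channels(feature, parts):
    """Evenly split a channel vector into `parts` local features."""
    if len(feature) % parts != 0:
        raise ValueError("channel count must be divisible by parts")
    size = len(feature) // parts
    return [feature[i * size:(i + 1) * size] for i in range(parts)]


def members(labels, index):
    """Indices of all samples that share a cluster with sample `index`."""
    return {j for j, lab in enumerate(labels) if lab == labels[index]}


def agreement(global_labels, local_labels, index):
    """Jaccard overlap between a sample's global and local cluster memberships.

    Cluster IDs from two separate clusterings are not comparable directly,
    so the memberships (who is grouped with whom) are compared instead.
    """
    glob = members(global_labels, index)
    loc = members(local_labels, index)
    return len(glob & loc) / len(glob | loc)


def keep_consistent(global_labels, local_labels, threshold=0.5):
    """Keep samples whose two clusterings agree at least `threshold` (assumed rule)."""
    return [i for i in range(len(global_labels))
            if agreement(global_labels, local_labels, i) >= threshold]


def spatial_sample(infrared_center, visible_center, weight):
    """Interpolate between two sub-cluster centers; `weight` in [0, 1] is set by hand."""
    return [(1 - weight) * a + weight * b
            for a, b in zip(infrared_center, visible_center)]


if __name__ == "__main__":
    # Six channels split into three local features of two channels each.
    print(split_channels([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], 3))
    # Sample 2 is grouped differently by the two clusterings and is dropped.
    print(keep_consistent([0, 0, 0, 1, 1], [5, 5, 7, 7, 7]))  # [0, 1, 3, 4]
    # A weight of 0.25 keeps the new sample closer to the infrared center.
    print(spatial_sample([0.0, 2.0], [4.0, 6.0], 0.25))  # [1.0, 3.0]
```

In this sketch, a sample survives only when the samples grouped with it under the global features largely match those grouped with it under a local feature, which is one plausible way to discard noisy cluster members. The interpolation weight plays the role of the manual setting that controls how strongly a spatial sample leans toward either adjacent sub-cluster.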
Keywords/Search Tags: pedestrian re-identification, unsupervised algorithm, bi-directional optimization, pseudo-labels, style transfer, spatial samples