| Attention mechanisms can effectively improve the ability of deep neural networks to extract task-related features. In person re-identification, they are widely used during feature extraction to learn more discriminative person features. In addition, using a multi-branch network to obtain diverse discriminative features is also crucial for improving model performance. Because the structures of both attention mechanisms and deep neural networks are highly flexible, designing attention modules and network architectures that maximize model performance has become an important research direction in this field. Focusing on attention mechanisms and deep neural network structure, this thesis aims to improve the discriminative ability of person re-identification models. The main contributions are summarized as follows: (1) To address the problem that features extracted by an attention module lose the positional information of person features, a position-aware attention module that fuses positional encoding with an attention module is proposed for person re-identification. The module divides the input feature map into blocks and marks the position of each block, so that the extracted features contain not only information about person body parts but also the corresponding position information. This position information serves as prior knowledge for identifying person features, thereby improving the discriminative ability of the model. After experimentally comparing and analyzing a variety of attention modules and positional encoding schemes, a scheme combining non-local block attention modules with adaptively learned positional encoding is proposed to improve recognition accuracy. Extensive experimental results show that the proposed position-aware attention module improves the discriminative ability of the network's feature extraction. In particular, the proposed CNN
with attention achieves 83.8% mAP on CUHK03-Labeled and 83.2% Rank-1 on CUHK03-Detected, which are 0.9% and 1.1% higher than the original CNN, respectively. (2) A hybrid network model fusing a Convolutional Neural Network (CNN) and a Transformer is proposed to address the feature diversification problem. The proposed model employs five branches for joint training, which gives the learned model a stronger ability to distinguish different persons. Features extracted by the multiple attention modules in the Transformer retain more detail than features extracted by conventional convolution. The proposed network therefore not only makes full use of the inductive bias and other advantages of the CNN, but also exploits the Transformer's strength in extracting features from global information. For the person re-identification task, using a Transformer-based network (TransReID) in parallel with the Feature Pyramid Branch (FPB) further improves performance. Extensive experiments on popular person re-identification datasets show the superiority of the proposed hybrid multi-branch network. In particular, on MSMT17, the largest of these datasets, the proposed network achieves 66.5% mAP and 84.6% Rank-1, which are 4.1% and 6.2% higher than the CNN, respectively. |
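
The position-aware attention idea in contribution (1) can be illustrated with a small sketch. This is not the thesis implementation: it assumes that each block is average-pooled into one vector, that the "adaptively learned" positional encoding is a per-block vector added to that block's features (random values stand in for trained parameters here), and that the non-local block is reduced to plain dot-product attention with a residual connection, without the learned projections a real non-local block would have. The names `pool_blocks` and `position_aware_attention` are hypothetical.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pool_blocks(feature_map, rows, cols):
    """Average-pool an H x W x C feature map into rows * cols block vectors.

    Assumes H is divisible by rows and W by cols.
    """
    height, width = len(feature_map), len(feature_map[0])
    channels = len(feature_map[0][0])
    bh, bw = height // rows, width // cols
    blocks = []
    for r in range(rows):
        for c in range(cols):
            acc = [0.0] * channels
            for y in range(r * bh, (r + 1) * bh):
                for x in range(c * bw, (c + 1) * bw):
                    for k in range(channels):
                        acc[k] += feature_map[y][x][k]
            blocks.append([v / (bh * bw) for v in acc])
    return blocks


def position_aware_attention(blocks, pos_embed):
    """Add a per-block position vector, then mix blocks with dot-product
    attention and a residual connection (simplified non-local block)."""
    tokens = [[b + p for b, p in zip(blk, pos)]
              for blk, pos in zip(blocks, pos_embed)]
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim)
                  for k in tokens]
        weights = softmax(scores)
        mixed = [sum(w * v[d] for w, v in zip(weights, tokens))
                 for d in range(dim)]
        out.append([qi + mi for qi, mi in zip(q, mixed)])
    return out


# Toy usage: a 4 x 4 x 3 feature map split into a 2 x 2 grid of blocks.
random.seed(0)
H, W, C = 4, 4, 3
fmap = [[[random.random() for _ in range(C)] for _ in range(W)]
        for _ in range(H)]
# Stand-in for learned positional parameters (one vector per block).
pos = [[random.gauss(0.0, 0.02) for _ in range(C)] for _ in range(4)]
out = position_aware_attention(pool_blocks(fmap, 2, 2), pos)
```

With an all-zero positional encoding and identical blocks, every output vector is the same, so the blocks cannot be told apart by location. Adding distinct position vectors makes the outputs differ, which is the sense in which position acts as prior knowledge in this sketch.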