
Feature Extraction And Generation For Person Re-Identification

Posted on: 2023-02-25 | Degree: Master | Type: Thesis
Country: China | Candidate: X Sun | Full Text: PDF
GTID: 2568306821954079 | Subject: Computer technology
Abstract/Summary:
Person re-identification (Re-ID) is an active area of computer vision research. Its goal is to find images of a pedestrian with the same identity in an image library captured by cameras with non-overlapping views, based on clothing, posture, hairstyle and other appearance information. With the spread of video surveillance in daily life, person Re-ID can be used to search for people in shopping malls, to help police search for suspects, and in similar scenarios. Video-to-Video (V2V) and Image-to-Video (I2V) person Re-ID are the more practical settings. However, V2V person Re-ID still lacks a way to extract the spatial-temporal features of a video precisely and completely, and I2V person Re-ID suffers from features that are difficult to compare because images and videos carry asymmetric amounts of information. Using computer vision and deep learning, this thesis studies temporal and spatial feature extraction in V2V person Re-ID and the information asymmetry problem in I2V person Re-ID. The main contents are as follows:

(1) Extraction of video temporal information. A Long-short Temporal Information Fusion (LSTFF) architecture is proposed for V2V person Re-ID. First, the input video sequence is segmented into clips in chronological order. The short-term features of the different clips are obtained by different pseudo-3D modules supplemented by an improved Non-local attention module. A global attention layer then aggregates all clip-level features to obtain the long-term features of the input video. The LSTFF network achieves Top1 scores of 89.6%, 89.3% and 94.7% on the MARS, iLIDS-VID and DukeMTMC-VideoReID datasets, respectively, which is 2.3%, 2.7% and 7.4% higher than the baseline.

(2) Refinement of video spatial features. A Mask-guiding Reference Attention (MRA) module is proposed, which uses a specific mask to guide the reference attention so that the network focuses on the noteworthy parts of the pedestrian. This reduces the influence of useless information such as the background, while effective spatial information is accumulated continuously. In addition, the MRA module is combined with the LSTFF network, so that the temporal and spatial information of a video sequence can be extracted simultaneously to optimize pedestrian features. Compared with the baseline, the Top1 scores of the STFF network on the MARS, iLIDS-VID and DukeMTMC-VideoReID datasets improve by 2.9%, 0.1% and 7.2%, reaching 89.5%, 83.4% and 95.3%, respectively.

(3) Generation of spatial-temporal features for images. To solve the information asymmetry problem in I2V person Re-ID, a Generative Adversarial Spatial-temporal Feature (GASTF) network is proposed. The GASTF network is divided into a feature generation flow and a feature extraction flow. The feature generation flow takes a single image as input and generates spatial-temporal features with a generative adversarial network. Meanwhile, the feature extraction flow takes a video as input and uses a ResNet50 network with a Non-local self-attention module to extract the spatial-temporal features of the input video sequence. Finally, the features of the two streams are forced to be similar, so that the parameters of the feature generation flow are optimized to supplement the image with realistic spatial-temporal features. The GASTF network applies generative adversarial networks to I2V person Re-ID for the first time. Its Top1 scores on the MARS, iLIDS-VID and DukeMTMC-VideoReID datasets reach 74.3%, 45.3% and 75.6%, respectively.
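
The clip-then-aggregate idea behind the temporal branch in (1) can be illustrated with a small Python sketch. It is not the thesis implementation: the pseudo-3D and Non-local modules are omitted, each clip is assumed to be already reduced to a plain feature vector, and the names `split_into_clips`, `aggregate_clips` and `query` are hypothetical. The `query` vector stands in for whatever parameters a global attention layer would learn.

```python
import math
from typing import List, Sequence


def split_into_clips(frames: Sequence, clip_len: int) -> List[list]:
    """Split a frame sequence into consecutive clips, preserving chronological order."""
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    return [list(frames[i:i + clip_len]) for i in range(0, len(frames), clip_len)]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_clips(clip_features: Sequence[Sequence[float]],
                    query: Sequence[float]) -> List[float]:
    """Attention-weighted sum of clip-level feature vectors.

    Each clip is scored by its dot product with `query` (an assumed stand-in
    for learned attention parameters); the scores are normalized with softmax
    and used as weights for the sum.
    """
    scores = [sum(f * q for f, q in zip(feat, query)) for feat in clip_features]
    weights = softmax(scores)
    dim = len(clip_features[0])
    return [sum(w * feat[d] for w, feat in zip(weights, clip_features))
            for d in range(dim)]
```

With an all-zero `query` every clip receives the same weight, so the result reduces to the mean of the clip features; a non-zero `query` shifts the weight toward clips whose features align with it.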
Keywords/Search Tags: person re-identification, deep learning, video spatial features, video temporal features, image-to-video person re-identification