| Social governance is a crucial aspect of national management. With the advancement of science and technology, the efficiency and quality of social governance have greatly improved. In recent years, video surveillance technology has developed rapidly, and an increasing number of surveillance cameras have been installed in cities, collecting vast amounts of monitoring data and providing more effective technical means of ensuring public safety. Person re-identification and tracking is one of the key computer vision technologies for automatic video analysis and processing. It handles tasks such as person retrieval, identity verification, and trajectory tracking in surveillance data, and is a core technology for intelligent video analysis. However, because monitoring scenarios are complex, person re-identification and tracking algorithms face many difficulties, and their performance still falls significantly short of what practical applications require. (1) In person re-identification (ReID), the discriminative features learned by the model are of low quality. The model cannot effectively handle the high inter-class similarity caused by similar appearances or the low intra-class similarity caused by changes in posture and camera angle. Additionally, when persons are occluded, a large amount of irrelevant background information is introduced, leading to spatial misalignment. (2) Constructing an effective deep neural network for person re-identification is challenging. Different scenes require tailored networks, and their design relies mostly on human expertise, so the result depends heavily on the designer’s experience. (3) In multi-object tracking (MOT) of persons, movement of the persons or of the cameras changes the apparent scale of the persons, and the scale changes caused by moving cameras are especially large. This reduces the effectiveness of algorithms based on motion features (position, speed, IoU, etc.). (4) Many MOT 
algorithms rarely, if ever, use appearance features in the actual tracking process, which degrades their performance. This thesis addresses the above issues, and its main work is as follows: (1) Person re-identification fusing multi-granularity features and human knowledge. Extracting distinctive features from person images plays a crucial role in resolving the problems of inter-class and intra-class similarity. The key to improving ReID performance lies in introducing local information into the model design, so that the model focuses on local discriminative features, and in introducing domain knowledge into the model. To this end, the thesis proposes MGHK, an end-to-end multi-branch network model that combines multi-granularity features with human body knowledge. The model extracts the most salient appearance features of persons for differentiation and learns more fine-grained discriminative features. With human body keypoints, the model pays more attention to foreground information during feature extraction. The model has shown excellent performance across multiple datasets. (2) ReID method based on network architecture search with integrated domain knowledge parts. Designing a neural network architecture for ReID currently requires a lot of human intervention and depends heavily on the designer’s experience. Conventional NAS methods automatically search for a neural network architecture, which mainly serves as the backbone network for classification tasks. However, there is a fundamental difference between classification tasks and re-identification tasks. The inter-class distance between images in classification tasks may be much greater than the intra-class distance between pedestrian images in re-identification tasks, which can cause many problems when such architectures are applied to ReID. The thesis modifies the DARTS neural network architecture search space by adding multiple domain parts, creating a neural network architecture search space 
tailored for ReID, and introducing a ReID method that integrates domain knowledge parts into network architecture search. Experimental results indicate that the proposed method is effective and achieves state-of-the-art performance compared with similar ReID methods. (3) Person multi-object tracking method based on GNN and memory fusion. As object detection has matured, the focus of the person MOT task has gradually shifted to data association. A graph neural network (GNN) is a deep learning model built on graph data, and data association in the MOT problem naturally forms a graph structure. This thesis uses a GNN for data association. Unlike previous GNN-based methods that rely on frame-by-frame association, this work associates all target trajectories in the current frame. A memory structure is introduced to store the historical trajectory features of the targets, and the memory fusion network is integrated with the GNN association network so that the parameters of both networks are updated simultaneously during training, reducing computational complexity and speeding up training. The method is effective on the MOT17 and MOT20 datasets and achieves performance comparable or superior to other state-of-the-art trackers. (4) Person multi-object tracking method based on GNN and appearance feature enhancement. In the MOT task, the model is expected to balance appearance and motion features selectively across different scenarios. This thesis finds that in practical tracking, the model rarely, if ever, uses appearance features. However, appearance features are rich in information and have proven effective in the ReID domain. Building on the work in (3), this thesis proposes a method to enhance appearance features in the MOT algorithm. The approach uses the TBD baseline architecture from (3) and incorporates the idea of PCB to strengthen the feature extraction module. Experimental results on the MOT17 dataset show that this method improves tracking performance 
by approximately 3%. Compared with similar algorithms (those using a GNN for data association) that use only the official detector, its performance is also at an advanced level. |
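
The abstract notes that scale changes, especially those caused by moving cameras, weaken association based on motion features such as IoU. The following is a minimal sketch of that effect, not the thesis's implementation: the `(x1, y1, x2, y2)` box format, the helper names `iou` and `scale_about_center`, and the example coordinates are all assumptions chosen for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned (x1, y1, x2, y2) boxes."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # Clamp at zero so disjoint boxes give an empty intersection.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def scale_about_center(box, factor):
    """Hypothetical helper: resize a box around its own center."""
    cx = (box[0] + box[2]) / 2
    cy = (box[1] + box[3]) / 2
    half_w = (box[2] - box[0]) / 2 * factor
    half_h = (box[3] - box[1]) / 2 * factor
    return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)


# Illustrative box for one person in the previous frame (made-up numbers).
prev_box = (100.0, 100.0, 140.0, 200.0)

# Same person, same center, unchanged scale: perfect overlap.
same_scale = iou(prev_box, prev_box)

# Same person, same center, but the camera moved closer and the box doubled.
zoomed = iou(prev_box, scale_about_center(prev_box, 2.0))

print(same_scale, zoomed)
```

Even though the person's center has not moved, doubling the box size cuts the overlap to a quarter in this example, which can push a correct match below a typical IoU gating threshold. This is one way to read the abstract's motivation for relying more on appearance features during association.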