| With the rapid development of computer vision, deep-learning-based image processing has attracted wide attention from the academic community. As an important research direction in computer vision, multi-object tracking performs both location detection and trajectory tracking of targets, and is widely used in autonomous driving, intelligent security, and intelligent transportation. Among these applications, pedestrian multi-object tracking is one of the most important. The scenes it targets suffer from severe occlusion, image distortion, and frequent illumination changes, which make locating and tracking pedestrians very challenging. Detecting and tracking pedestrians efficiently in complex scenes therefore remains a difficult research problem. To address these problems, this paper builds a multi-object tracking algorithm, ECT++, on top of FairMOT, with improvements to both detection and tracking. (1) E-CenterNet, an improved version of the CenterNet object detection algorithm, is proposed to make the detector better suited to complex scenes. To address missed detections in FairMOT's detection module, the channel attention module SKC is introduced into the DLA34 backbone to form the SK-DLA34 backbone, which strengthens the network's ability to extract target features. In addition, a multi-scale spatial awareness module is added to the decoder; it applies spatial attention to features at different scales before fusion, making the spatial information in the feature map more salient. Furthermore, because the L1 loss used by the algorithm ignores the degree of overlap between the predicted box and the ground-truth box, bounding-box localization is inaccurate; the GIoU loss is therefore introduced to supervise the bounding boxes output by the model, which improves localization accuracy. (2) Following the joint detection and tracking architecture, the tracking algorithm ECT++ is built with E-CenterNet as its detection module. Because sharing a single feature map between the re-identification branch and the detection branch degrades appearance feature extraction, a bottom-up feature fusion method is used to generate features for the re-identification branch. This enriches the high-level semantic information in the re-identification feature map and yields more discriminative appearance features. To address the problem that low-confidence targets are mistakenly discarded during tracking, which interrupts trajectories, a re-identification feature search module is proposed. It uses appearance information from historical frames to search the current frame and recall low-confidence targets through temporal information, which further improves trajectory accuracy and continuity. Together, these components form the pedestrian multi-object tracking algorithm ECT++. Ablation and comparison experiments are conducted for E-CenterNet and ECT++ on the pedestrian multi-object tracking datasets MOT17 and MOT20, verifying the feasibility and effectiveness of the proposed model for pedestrian multi-object tracking. |
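
The abstract's argument for replacing the L1 loss with GIoU is that L1 compares box coordinates independently and does not measure how much the predicted and ground-truth boxes overlap. The following is a minimal sketch of the standard GIoU loss for a single pair of axis-aligned boxes, not the paper's implementation. The function name `giou_loss` and the `(x1, y1, x2, y2)` corner format are assumptions chosen for illustration, and both boxes are assumed to have positive area.

```python
def giou_loss(pred_box, gt_box):
    """Return 1 - GIoU for two boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes x2 > x1 and y2 > y1.
    """
    px1, py1, px2, py2 = pred_box
    gx1, gy1, gx2, gy2 = gt_box

    area_pred = (px2 - px1) * (py2 - py1)
    area_gt = (gx2 - gx1) * (gy2 - gy1)

    # Intersection rectangle (zero width/height if the boxes are disjoint).
    inter_w = max(0.0, min(px2, gx2) - max(px1, gx1))
    inter_h = max(0.0, min(py2, gy2) - max(py1, gy1))
    inter = inter_w * inter_h

    union = area_pred + area_gt - inter
    iou = inter / union

    # Smallest axis-aligned box enclosing both boxes.
    enclose_w = max(px2, gx2) - min(px1, gx1)
    enclose_h = max(py2, gy2) - min(py1, gy1)
    enclose = enclose_w * enclose_h

    # GIoU subtracts the fraction of the enclosing box not covered by the union,
    # so disjoint boxes still produce a distance-sensitive value.
    giou = iou - (enclose - union) / enclose
    return 1.0 - giou


if __name__ == "__main__":
    print(giou_loss((0, 0, 2, 2), (0, 0, 2, 2)))  # identical boxes
    print(giou_loss((0, 0, 1, 1), (2, 0, 3, 1)))  # disjoint boxes
```

Unlike plain IoU, the loss keeps growing as two non-overlapping boxes move apart, so it still gives a useful training signal when a prediction misses the target entirely.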