Multi-object tracking (MOT) has long been an active research topic in computer vision. Its essence is to detect multiple targets in video frames and associate them across frames to obtain each target's trajectory. MOT is an indispensable part of intelligent transportation systems and autonomous vehicle perception systems, where it is responsible for detecting moving pedestrians in the field of view and tracking their pose. However, vision-based pedestrian multi-object tracking models still face many open problems. On the one hand, owing to the structural limitations of image sensors, vision-based MOT is prone to failure in harsh environments or poor lighting. On the other hand, the shape changes, motion blur, and background interference of vehicles and pedestrians in traffic scenes remain important factors that degrade the accuracy of visual multi-object tracking and association. To address these problems, the main work and contributions of this paper are as follows:

(1) To cope with complex and changeable environments, a feature enhancement module based on the attention mechanism is proposed to improve the CenterTrack multi-object tracking algorithm. In intelligent transportation systems, the video collected by perception cameras often suffers from background interference and varying viewing angles. First, the structure and layers of the DLA-34 backbone network are improved by combining the Swish activation function with instance normalization, which optimizes pedestrian feature extraction in multi-object tracking tasks and enhances the domain generalization performance of the model. Second, based on the self-attention mechanism and pyramid split attention, a feature enhancement module is designed for the backbone network to improve its representation of multi-scale features.

(2) To address environmental occlusion, mutual occlusion between pedestrians, and pedestrians frequently entering and leaving the field of view, a multi-level association network based on deep affinity matching is proposed. Three modules are designed: a simple multi-level Re-ID module for pedestrian trajectory data association, a deep affinity association module for pedestrian appearance feature matching, and a target occlusion state estimation module for handling occlusion. Then, combined with the unscented Kalman filter for trajectory prediction, a multi-level affinity matching network is constructed to output the matched pedestrian tracking trajectories.

(3) Because existing visual multi-object tracking datasets are relatively simple and datasets of complex road scenes are relatively scarce, a dataset of complex urban traffic scenes is proposed. Building on the laboratory's existing labeled dataset (BUUISE-ADS), the labels of the data containing more pedestrians are converted into the MOT dataset format to form the multi-object tracking dataset BUUISE-MOT.

This paper proposes a pedestrian multi-object tracking algorithm based on monocular vision. The algorithm takes input from a monocular camera or video stream, improves and optimizes target feature extraction, target localization, and trajectory data association, and achieves high-precision, robust, real-time multi-object tracking. Its effectiveness is verified on the MOT17 and MOT20 datasets and on the self-built BUUISE-MOT dataset. It performs comparably to advanced algorithms while running in real time, and provides a reasonable solution for pedestrian multi-object tracking in complex traffic scenes.
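
As a rough illustration of the two building blocks named in contribution (1), the sketch below implements the Swish activation and a per-channel normalization step in plain Python. It reads the normalization step as instance normalization, which is an assumption. The function names, the order of operations (normalize, then activate), and the omission of learnable scale and shift parameters are also assumptions made for illustration; the text does not say where these operations sit inside DLA-34.

```python
import math
from typing import List

Channel = List[List[float]]  # one H x W feature channel of a single sample


def swish(x: float, beta: float = 1.0) -> float:
    """Swish activation: x * sigmoid(beta * x), written to avoid overflow."""
    z = beta * x
    if z >= 0:
        return x / (1.0 + math.exp(-z))
    ez = math.exp(z)  # z < 0, so exp(z) cannot overflow
    return x * ez / (1.0 + ez)


def instance_norm(channel: Channel, eps: float = 1e-5) -> Channel:
    """Normalize one channel of one sample over its spatial positions.

    Learnable scale/shift parameters are omitted in this sketch.
    """
    values = [v for row in channel for v in row]
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    scale = 1.0 / math.sqrt(var + eps)
    return [[(v - mean) * scale for v in row] for row in channel]


def norm_then_activate(feature_map: List[Channel]) -> List[Channel]:
    """Hypothetical block: instance normalization followed by Swish, per channel."""
    return [
        [[swish(v) for v in row] for row in instance_norm(channel)]
        for channel in feature_map
    ]
```

Because instance normalization computes its statistics per sample and per channel, it removes sample-specific contrast and brightness shifts, which is the usual argument for its benefit to domain generalization. Swish is smooth and non-monotonic near zero, unlike ReLU.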