| With the continuous development of computer technology, multi-target tracking has become an important research direction in computer vision and is widely used in video surveillance, autonomous driving, motion analysis, and other fields. In recent years, supported by deep learning, multi-target tracking has achieved remarkable results. However, when a target is occluded or briefly disappears during motion, its features are not fully utilized, so tracking accuracy drops or tracking fails altogether. To better handle target occlusion in multi-target tracking, this paper studies multi-target tracking methods based on joint detection, as follows. First, to address the shortcomings of joint-detection-based multi-target tracking under occlusion, a multi-target tracking method based on motion prediction and data association (MPDA-Net) is proposed. The method constructs a Re-ID offset prediction branch that calculates the offset between the target feature extraction position and the center point of the target detection box. This branch resolves the inconsistency between the feature center and the detection box when extracting Re-ID appearance features, yielding more accurate Re-ID features. An improved Kalman filter is used to estimate the width and height of the target bounding box, which gives a more accurate bounding box position and improves the quality of the detection boxes. In addition, MPDA-Net retains all detection boxes, divides them into high-confidence and low-confidence boxes according to a confidence threshold, and performs data association matching on each group separately to reduce missed detections of occluded targets. Second, to address the problem that MPDA-Net uses only local information to extract features and ignores the contextual relationships between adjacent keys, a Transformer-based detection and tracking method with joint motion prediction and data association (MPDA-TRC) is proposed. The method redesigns the ResNet convolution module in the MPDA-Net backbone network to exploit the rich contextual relationships between adjacent and long-range pixels, which guide the learning of a dynamic attention matrix, enhance the visual representation of deep features, and improve tracking performance. Experiments on the MOT15, MOT16, and MOT17 multi-target tracking datasets, evaluated with the metrics of multi-target tracking accuracy and detector localization accuracy, verify that the proposed MPDA-TRC method achieves good tracking performance; ablation experiments are also conducted on the MOT datasets. The experimental results show that MPDA-TRC tracks more effectively than typical multi-target tracking algorithms such as SORT, JDE, and FairMOT, and copes better with target occlusion in multi-target tracking. |
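
The confidence-split data association described above can be illustrated with a minimal sketch. The abstract does not give the matching cost, the threshold values, or the assignment algorithm, so everything below is an assumption made for illustration: the function names (`iou`, `greedy_match`, `two_stage_associate`), the thresholds `conf_threshold=0.5` and `min_iou=0.3`, and the use of greedy IoU matching as a simple stand-in for whatever cost MPDA-Net actually uses. A practical tracker would more likely match Kalman-predicted boxes combined with Re-ID distances using an optimal assignment solver.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box],
    dets: List[Tuple[int, Box]],
    min_iou: float,
) -> Dict[int, int]:
    """Pair track ids with detection indices, highest IoU first (hypothetical helper)."""
    pairs = sorted(
        ((iou(tb, db), tid, di) for tid, tb in tracks.items() for di, db in dets),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little
        if tid in matches or di in used:
            continue
        matches[tid] = di
        used.add(di)
    return matches


def two_stage_associate(
    tracks: Dict[int, Box],
    detections: List[Tuple[Box, float]],
    conf_threshold: float = 0.5,  # assumed value, not from the paper
    min_iou: float = 0.3,         # assumed value, not from the paper
) -> Dict[int, int]:
    """Keep all detections; match high-confidence ones first, then low-confidence ones."""
    high = [(i, box) for i, (box, s) in enumerate(detections) if s >= conf_threshold]
    low = [(i, box) for i, (box, s) in enumerate(detections) if s < conf_threshold]

    # Stage 1: associate tracks with high-confidence detections.
    matches = greedy_match(tracks, high, min_iou)

    # Stage 2: tracks still unmatched get a chance against low-confidence
    # detections, which often correspond to partially occluded targets.
    remaining = {tid: box for tid, box in tracks.items() if tid not in matches}
    matches.update(greedy_match(remaining, low, min_iou))
    return matches


tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [
    ((1, 1, 11, 11), 0.9),        # clearly visible target
    ((21, 21, 31, 31), 0.2),      # low score, e.g. an occluded target
    ((100, 100, 110, 110), 0.1),  # low score, overlaps no track
]
print(two_stage_associate(tracks, detections))  # {1: 0, 2: 1}
```

In this toy example, discarding the low-confidence detections would leave track 2 unmatched; the second stage recovers it, which is the effect the abstract attributes to retaining all detection boxes.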