Pedestrian Tracking Based On Deep Layer Aggregation Network And Joint Loss Function

Posted on: 2024-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: M J Yu
Full Text: PDF
GTID: 2568307157965229
Subject: Computer technology
Abstract/Summary:
As target detection networks continue to improve and single-network multi-task designs gain ground, one-shot multi-target tracking algorithms have become mainstream. In multi-target tracking, however, target motion causes shape changes and occlusion, which make target features harder to extract and reduce localization accuracy. Detection accuracy drops as a result, and tracking performance suffers with it. To address these problems, this thesis builds on the target detection algorithm CenterNet and the multi-target tracking algorithm FairMOT and proposes a multi-target tracking algorithm that combines an improved deep layer aggregation network with a joint loss function to improve the tracking of pedestrian targets. The main work is as follows.

(1) A target detection algorithm based on improved deep layer aggregation is proposed. In CenterNet, the feature extraction network DLA34 iteratively fuses feature maps of four different sizes to support the detection of multi-category objects over a large range of scales. For pedestrian targets in video surveillance scenarios this is redundant and hinders model optimization, so this thesis optimizes the multi-layer iterative fusion strategy. First, starting from the original DLA34, feature fusion is performed on only three layers of feature maps to reduce redundant feature aggregation operations. Second, more convolutional layers are added to the hierarchical deep aggregation structure at the third downsampling stage to extract more accurate feature information, so that the network focuses on the scale range of most targets and is better suited to pedestrian detection. Finally, to retain more feature information during downsampling, soft pooling replaces the traditional pooling method to improve detection.

(2) The Softmax loss used by the ReID branch in FairMOT lacks flexibility and learns loosely clustered features, so a joint loss function is used to optimize the branch. First, the Triplet loss is introduced to strengthen the inter-class separability of sample features, and a batch normalization module (Batch Normalization Neck, BNNeck) is introduced to mitigate the inconsistency between the two loss functions. Then, to strengthen the intra-class compactness of sample features, a Center loss is introduced, which maintains a deep feature center for each class. Finally, the Softmax loss, Triplet loss and Center loss are combined into a joint loss function that optimizes the ReID branch so that it extracts more accurate identity embedding features.

(3) A multi-target tracking algorithm based on the improved deep layer aggregation and the joint loss function is proposed. Specifically, based on FairMOT, the improved deep layer aggregation network serves as the feature extraction network, and the ReID branch is optimized with the joint loss function to obtain more accurate appearance features. Kalman filtering is then used to model the motion of each target, a cost matrix is computed from appearance and motion features, and targets are matched between frames with the Hungarian algorithm to complete the tracking task.
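The joint loss described in item (2) can be illustrated with a minimal, dependency-free sketch. This is not the thesis implementation: the function names, the Euclidean distance, the margin of 0.3, the Center loss weight of 0.0005 and the simple weighted sum are all assumptions chosen for illustration, since the abstract does not give the loss weights or the exact formulation. Following the way BNNeck is usually described, the sketch assumes that the classification logits come from the batch-normalized feature, while the Triplet and Center losses act on the embedding before normalization.

```python
import math


def softmax_cross_entropy(logits, label):
    """Softmax (identity classification) loss for one sample."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def euclidean(u, v):
    """Euclidean distance between two embedding vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=0.3):
    """Pushes the negative at least `margin` farther away than the positive.

    The margin value is an illustrative assumption.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


def center_loss(feature, center):
    """Half the squared distance between a feature and its class center."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(feature, center))


def joint_loss(logits, label, anchor, positive, negative, center,
               margin=0.3, center_weight=0.0005):
    """Weighted sum of the three losses (weights are assumptions).

    `logits` is assumed to come from the normalized feature, and
    `anchor`, `positive`, `negative` from the pre-normalization embedding.
    """
    ce = softmax_cross_entropy(logits, label)
    tri = triplet_loss(anchor, positive, negative, margin)
    cen = center_loss(anchor, center)
    return ce + tri + center_weight * cen


if __name__ == "__main__":
    # Toy 2-D embeddings for one anchor, one same-identity sample,
    # and one different-identity sample.
    loss = joint_loss(
        logits=[2.0, 0.5, -1.0], label=0,
        anchor=[1.0, 0.0], positive=[0.9, 0.1], negative=[-1.0, 0.2],
        center=[0.95, 0.05],
    )
    print(f"joint loss: {loss:.4f}")
```

In this sketch the Triplet term is zero whenever the negative is already farther from the anchor than the positive by at least the margin, so only hard triplets contribute. The Center term keeps pulling every sample toward its class center, which is the complementary inter-class and intra-class behavior the abstract describes.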
Keywords/Search Tags:pedestrian detection, multi-target tracking, joint model, deep layer aggregation, joint loss function