| Visual tracking is an important research area in computer vision. With the emergence of big data and artificial intelligence, it has been widely applied in autonomous driving, smart healthcare, behavior recognition and other fields. The main task of visual tracking is to follow a target of interest through a given video sequence and to mark that target with a bounding box. In practical tracking scenarios, factors such as target deformation, occlusion and illumination variation interfere with the tracking process and reduce tracking accuracy. Designing tracking algorithms that remain accurate and robust under such complex environmental changes is therefore a major challenge for researchers. In this paper, we conduct an in-depth study and analysis of a series of deep learning tracking algorithms, in particular UpdateNet, which is based on Siamese networks, and TransT, which is based on the Transformer. Because both algorithms still have performance shortcomings, this paper improves each of them as follows. (1) Existing Siamese trackers usually either do not update the template or adopt a single update strategy. With these strategies, historical information cannot be used effectively, and the model drift caused by complex tracking challenges cannot be addressed. To address this issue, this paper proposes a novel tracking framework that learns the model update from local trusted templates. We propose a complementary confidence evaluation method that selects local trusted templates within a sliding window, which provides high-confidence historical information. We also propose an update method that combines linear learning and deep learning to learn the model update. Unlike traditional update strategies, our method combines nonlinear and linear updates to obtain reliable templates that carry the most abundant historical information, which alleviates complex tracking challenges to a certain extent. Finally, the response maps of the two strategies are adaptively fused according to the confidence evaluation to determine the final tracking result. Experimental results on NFS, UAVDT, UAV123, UAV20L and VOT2016 show that our method performs favourably compared with current state-of-the-art methods. (2) Model update is key to object tracking. Tracking models based on the Transformer structure currently overcome a defect of Siamese trackers, namely the loss of semantic information caused by linear matching in the correlation operation, and therefore achieve better tracking performance. However, they ignore the importance of model update and may suffer from model drift when the target appearance and the background change significantly. To address this issue, we propose a robust Transformer tracker that uses the minimum entropy criterion for model update. The main idea is to use the minimum entropy criterion to judge the reliability of a template, fuse the reliable template linearly with the initial template, feed the result into the attention-based feature fusion network, and obtain the bounding box after classification and regression. First, highly reliable templates are identified from the classification score of the prediction head of the baseline method TransT. Then, an updated template is generated from these highly reliable templates using the minimum entropy criterion. Finally, the updated template is applied to tracking through linear fusion with the initial template. Ablation experiments show that, compared with TransT, our method improves both robustness and EAO on VOT2016, VOT2018 and VOT2019. The method was also compared with state-of-the-art methods on four mainstream tracking datasets, GOT-10K, LaSOT, OTB100 and NFS, where it performs well. |
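
The three-step update described in (2) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Candidate` container, the confidence threshold `min_confidence`, the fusion weight `weight`, and the choice to measure entropy as the Shannon entropy of the softmax-normalized classification scores. It does not use or reproduce the TransT code.

```python
import math
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Candidate:
    """Hypothetical record for one candidate template from a past frame."""
    features: List[float]   # flattened template features (assumed layout)
    score_map: List[float]  # raw classification scores over search positions
    confidence: float       # peak classification score, assumed to lie in [0, 1]


def softmax(scores: List[float]) -> List[float]:
    """Normalize raw scores into a probability distribution."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs: List[float]) -> float:
    """Shannon entropy in nats; a peaked distribution gives a low value."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def select_reliable(candidates: List[Candidate],
                    min_confidence: float = 0.8) -> Optional[Candidate]:
    """Step 1: keep candidates whose classification score is high enough.
    Step 2: among those, pick the one with minimum response entropy."""
    reliable = [c for c in candidates if c.confidence >= min_confidence]
    if not reliable:
        return None
    return min(reliable, key=lambda c: entropy(softmax(c.score_map)))


def update_template(initial: List[float],
                    candidates: List[Candidate],
                    weight: float = 0.5,
                    min_confidence: float = 0.8) -> List[float]:
    """Step 3: fuse the selected template linearly with the initial one.
    Falls back to the initial template when no candidate is reliable."""
    chosen = select_reliable(candidates, min_confidence)
    if chosen is None:
        return list(initial)
    return [(1.0 - weight) * a + weight * b
            for a, b in zip(initial, chosen.features)]
```

Keeping the initial template in the fusion means that a single poorly chosen update cannot fully overwrite the first-frame appearance, which is one plausible reading of why the abstract fuses with the initial template rather than replacing it.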