Visual tracking has long been one of the most active research topics in computer vision. Its results play an important role in industry and everyday life and are widely used in fields such as precision guidance of weapons and equipment, autonomous driving, and video surveillance. Although visual tracking has achieved a great deal across these applications, the variety of complex factors and interference present in real environments means that current visual tracking algorithms still face many challenges and leave considerable room for improvement. This thesis studies feature fusion and the accuracy of classification and regression in current Siamese deep network visual tracking, and proposes two improvements that address the existing problems. The main work of this thesis is as follows.

First, an attention-based cascade feature fusion method is proposed to resolve the inconsistency between semantic and location information when deep and shallow features are fused in Siamese network visual tracking. In this method, a cascade attention module integrates deep and shallow features globally over channel information, so that the fused features have better semantic and positional expressiveness. In addition, a weight network is introduced through which the channel importance of the template branch is transferred to the search branch, so that the search branch pays more attention to the target being tracked and less to the interference around it, improving tracking accuracy. Comparison experiments on the OTB100 and UAV123 datasets show that the improved algorithm achieves better tracking performance.

Second, to improve the accuracy and robustness of target classification and regression in Siamese network visual tracking, an attention-based classification and regression head is proposed. A classification attention module is added to the classification branch to improve the algorithm's ability to discriminate between target and background, and a regression attention module is added to the regression branch to improve the localization accuracy of the tracking box. Comparison experiments on the OTB100, UAV123, VOT2016, VOT2018, and VOT2019 benchmark datasets show that adding the attention-based classification and regression head on top of the first improvement further improves tracking performance.
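
The transfer of channel importance from the template branch to the search branch, described under the first improvement, can be illustrated with a minimal sketch. The abstract does not specify the structure of the weight network, so the code below assumes a squeeze-and-excitation style design (global average pooling followed by a two-layer bottleneck and a sigmoid). The function names `channel_weights` and `reweight_search`, the reduction ratio `r`, the feature-map sizes, and the random weights are all hypothetical and stand in for learned parameters. It is not the thesis's implementation.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def channel_weights(template_feat, w1, w2):
    """Assumed weight network: one importance score per channel.

    template_feat: array of shape (C, H, W) from the template branch.
    w1: (C // r, C) bottleneck weights; w2: (C, C // r) expansion weights.
    Returns an array of shape (C,) with values in (0, 1).
    """
    pooled = template_feat.mean(axis=(1, 2))   # global average pooling -> (C,)
    hidden = np.maximum(w1 @ pooled, 0.0)      # ReLU bottleneck -> (C // r,)
    return sigmoid(w2 @ hidden)                # per-channel importance -> (C,)


def reweight_search(search_feat, weights):
    """Scale each channel of the search features by the template's weights."""
    return search_feat * weights[:, None, None]


# Hypothetical sizes and random parameters, for illustration only.
rng = np.random.default_rng(0)
C, r = 8, 2
template_feat = rng.standard_normal((C, 7, 7))
search_feat = rng.standard_normal((C, 31, 31))
w1 = rng.standard_normal((C // r, C)) * 0.1
w2 = rng.standard_normal((C, C // r)) * 0.1

weights = channel_weights(template_feat, w1, w2)
reweighted = reweight_search(search_feat, weights)
print(weights.shape, reweighted.shape)
```

Because the weights are computed only from the template, channels that respond strongly to the target are emphasized in the search features regardless of what surrounds the target in the search region, which is the intuition behind suppressing nearby interference.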