| Single object tracking has evolved from traditional methods to correlation filtering methods and now to the widely used deep learning methods. Tracking accuracy and success rate have steadily improved, and related products have been developed in many fields, greatly facilitating people’s lives. Despite this progress, tracking algorithms are still prone to drift in complex scenes, which leads to tracking failure. The fully convolutional Siamese network tracking algorithm, based on deep learning, opened up a new approach to constructing object tracking networks. Two branches with shared parameters extract features from the object and from the search area, and a similarity measure determines the location of the object within the search area. This simple approach has attracted wide attention for its high accuracy and real-time performance. However, the fully convolutional Siamese network predicts the object less accurately and discriminates less well between object and background when the object scale changes, the background is cluttered, or the motion is blurred. To address this poor performance in complex scenes, this paper proposes an improved tracking strategy for Siamese networks. The main research contents are as follows: (1) A channel attention mechanism and a spatial attention mechanism are introduced to enhance the ability to discriminate between object and background. Feature extraction in a convolutional neural network generates a large number of feature channels and a large amount of spatial information. Experiments show that only part of these channels and this spatial information is useful for the tracking task; the rest is redundant and may even have a negative effect. During the training of Siamese networks, integrating an attention mechanism can filter the
channel and spatial information of the feature map, giving high weight to information that is useful to the tracking network and low weight to useless information, thereby highlighting the role of positive information. (2) Shallow and deep features are fused through residual connections to enhance the representation of object features. The fully convolutional Siamese network uses only the last-layer features extracted by the backbone network as the object feature, ignoring shallow features and leaving the object insufficiently represented. Shallow features usually capture the spatial details of the object, such as contour and texture, while deep features capture its high-level semantic information. Combining the two through residual connections yields a more comprehensive representation of the object. (3) An adaptive object box generation network is proposed to replace SiamFC’s method of generating predicted object boxes, so that the predicted boxes fit the object more closely. The strategy that SiamFC uses to handle object scale changes is analyzed, showing that SiamFC cannot generate appropriate object boxes for arbitrary scale changes because its number of scaling factors is limited. The SiamRPN algorithm, which introduces a region proposal network, is also analyzed: although it performs much better than SiamFC, it has disadvantages of its own. In this paper, adding an adaptive object box generation network to the Siamese network mitigates the deficiencies of SiamFC and SiamRPN in object box generation. (4) The improved algorithm is tested and compared with other mainstream algorithms on the OTB100, UAV123, and VOT2016 datasets. The proposed algorithm achieves excellent performance in complex scenes such as object deformation, occlusion, and background clutter, which verifies the effectiveness of the proposed improvement strategy. |
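
The similarity-measure step described in the abstract can be illustrated with a minimal sketch. It assumes the similarity measure is a plain sliding-window cross-correlation between a template feature map and a search-area feature map, as in the general fully convolutional Siamese design; the abstract does not give the paper's exact formulation. The function names `cross_correlate` and `locate_peak`, and the single-channel toy feature maps, are hypothetical and chosen only for illustration. A real tracker would correlate multi-channel features produced by the shared backbone.

```python
def cross_correlate(search, template):
    """Slide `template` over `search` and return the response map.

    Both inputs are 2-D lists standing in for single-channel feature maps
    (an assumption for illustration; real features are multi-channel).
    """
    sh, sw = len(search), len(search[0])
    th, tw = len(template), len(template[0])
    response = []
    for i in range(sh - th + 1):
        row = []
        for j in range(sw - tw + 1):
            # Inner product between the template and the current window.
            score = 0.0
            for u in range(th):
                for v in range(tw):
                    score += search[i + u][j + v] * template[u][v]
            row.append(score)
        response.append(row)
    return response


def locate_peak(response):
    """Return the (row, col) of the highest score in the response map."""
    best_score, best_pos = None, None
    for i, row in enumerate(response):
        for j, score in enumerate(row):
            if best_score is None or score > best_score:
                best_score, best_pos = score, (i, j)
    return best_pos


# Toy example: the template pattern is embedded in the search area
# with its top-left corner at row 2, column 1.
template = [[1, 2],
            [3, 4]]
search = [[0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0],
          [0, 1, 2, 0, 0],
          [0, 3, 4, 0, 0],
          [0, 0, 0, 0, 0]]

response = cross_correlate(search, template)
peak = locate_peak(response)
print(peak)
```

The peak of the response map marks where the search area is most similar to the template. Raw cross-correlation of this kind is sensitive to cluttered or high-intensity background regions, which is consistent with the motivation the abstract gives for adding attention and feature fusion.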