| Intelligent scene understanding urgently needs computer vision technology, and visual object tracking is one of its key components. In visual object tracking, the tracker learns the attribute features of the target template provided in the first frame and then accurately discriminates and tracks the target in subsequent frames. After considerable development, existing tracking algorithms can overcome low resolution, occlusion, and other factors to a certain extent and achieve long-term tracking of targets. Under the many interfering factors present in real-world tracking scenes, however, these algorithms still show low accuracy and low robustness. This thesis therefore proposes a target tracking algorithm based on a Siamese network and multi-path feature fusion. The main research contents are as follows: (1) To address the tracker's insensitivity to target feature responses and its limited localization accuracy, a target tracking algorithm based on a feature optimization model is proposed. Following the idea of convolution blocks and deconvolution, this thesis constructs a multi-scale reinforcement learning feature selection module that performs global feature encoding on the feature map at multiple scales without changing the feature resolution, and normalizes the encoded features. A convolution block then aligns and fuses the encoded features with the original feature vector to achieve multi-path feature fusion. As a result, the tracker becomes more sensitive to target feature responses while the network remains sparse. Experiments verify that the proposed algorithm is competitive, and the thesis realizes end-to-end training that is applicable to a wider range of applications. (2) To address flaws in common mechanisms for discriminating high-order feature information, which lead to tracking drift, a target tracking algorithm based on feature modulation and a memory learning mechanism is proposed. By
introducing a gating mechanism that discriminates high-order tensor feature information, the algorithm regulates the degree of feature correlation within a single frame of the sequence, so that the interaction information among image features can assist target localization. In addition, based on an analysis of the tracker's ability to extract fine-grained features, a segmented memory learning module is constructed. This module traverses the image and extracts its fine-grained features, reviews and filters the initial feature vectors learned by the tracker, improves the utilization of existing high-order feature information, and enhances the recognition of target feature information. The experimental results show that the proposed algorithm establishes a reliable screening mechanism for high-order feature information and achieves accurate, long-term target tracking. Both tracking approaches are trained offline on the GOT-10k and ILSVRC-VID2015 training sets to obtain the tracker model, which is then evaluated on the OTB100 and VOT2018 test sets. The tracking speed reaches 59 fps (frames/s) on OTB100 and 44 fps (frames/s) on VOT2018. Finally, the performance of the proposed tracking approaches is further analyzed through ablation experiments. |
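
The abstract names a gating mechanism and multi-path feature fusion on top of a Siamese tracker, but it does not specify their exact form. As a rough illustration only, the NumPy sketch below assumes a per-channel sigmoid gate computed from globally pooled descriptors of two feature paths, followed by a naive Siamese cross-correlation that produces a response map. The names `gated_fusion`, `cross_correlate`, and `w_gate` are hypothetical and are not taken from the thesis, and the gate weights here would in practice be learned rather than supplied by hand.

```python
import numpy as np


def sigmoid(x):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-x))


def gated_fusion(original, encoded, w_gate, b_gate=0.0):
    """Fuse two feature paths of shape (C, H, W) with a per-channel gate.

    Assumption: the gate is a sigmoid over a linear map of the globally
    averaged descriptors of both paths. `w_gate` has shape (C, 2C).
    """
    # Global average pooling: one scalar per channel for each path.
    desc = np.concatenate(
        [original.mean(axis=(1, 2)), encoded.mean(axis=(1, 2))]
    )  # shape (2C,)
    gate = sigmoid(w_gate @ desc + b_gate)  # shape (C,)
    gate = gate[:, None, None]  # broadcast over H and W
    # Convex combination of the two paths; spatial resolution is unchanged.
    return gate * encoded + (1.0 - gate) * original


def cross_correlate(template, search):
    """Slide a (C, h, w) template over a (C, H, W) search feature map.

    Returns a response map of shape (H - h + 1, W - w + 1); the peak
    marks the most likely target position.
    """
    _, h, w = template.shape
    _, big_h, big_w = search.shape
    response = np.empty((big_h - h + 1, big_w - w + 1))
    for i in range(response.shape[0]):
        for j in range(response.shape[1]):
            response[i, j] = np.sum(search[:, i:i + h, j:j + w] * template)
    return response
```

With an all-zero `w_gate` the gate is 0.5 for every channel, so the fused map is simply the average of the two paths; non-zero weights let individual channels favor either the encoded or the original features. The explicit double loop in `cross_correlate` is chosen for readability, not speed.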