| Visual tracking is an active research topic in computer vision and image processing. It has a wide range of applications in intelligent video surveillance, intelligent transportation, video-based human-computer interaction, intelligent visual navigation, modern military systems, medical imaging research, and many commercial products. Target tracking works as follows: given the target in the first frame, the tracking algorithm adaptively estimates the target's position and scale in each subsequent frame. During tracking, the complexity of the scene and the variability of the target itself pose major challenges. Inspired by many excellent tracking algorithms, this article introduces our target tracking algorithms from two perspectives: traditional correlation filtering and the now-popular deep learning. Discriminative correlation filtering performs very well in target tracking, and the choice of feature descriptors has an important influence on its performance. Conventional discriminative correlation filters have achieved significant results through multi-channel and multi-scale feature fusion. However, different features differ in their ability to describe the environment. If every feature channel is given the same degree of confidence, the contribution of the more reliable features may be limited. In view of this, this paper proposes an improved C-COT algorithm that adaptively weights the feature channels for each frame. It uses the average peak-to-correlation energy (APCE) to evaluate the response map of each feature channel and uses this score to guide the target appearance model in assigning different weights to different filters. The weighted responses are then combined into a final response map, whose peak is used to locate the target. In addition, the C-COT algorithm uses a continuous learning strategy in which the model is updated at every frame. This over-updating leads to over-fitting of the model and reduced speed. Therefore, to reduce sample redundancy and improve the quality of the training samples, this paper uses an improved peak-to-sidelobe ratio (PSLR) criterion to decide when to update the model. Single-target tracking based on metric learning also still has room for performance improvement. In this paper, we introduce our proposed end-to-end deep metric network (DMN) object tracking algorithm. The algorithm jointly learns a deep feature embedding and a distance metric, and directly produces a scalar similarity score during tracking. We also propose a loss function suited to this task. After training with the joint loss function, the DMN can be applied directly to any new sequence once the instances in the first frame have been initialized. Based on the scalar output of the DMN and a smooth-motion constraint, the candidate box with the highest score is taken as the best position of the target object. In the experimental part, we verify the effectiveness of each component through ablation experiments and compare our methods with other algorithms on OTB-13 and OTB-15. The experimental results show that the two algorithms proposed in this paper improve tracking accuracy, especially in certain specific video scenarios. |
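
The APCE-guided channel weighting and the peak-to-sidelobe check described above can be illustrated with a minimal NumPy sketch. It uses the commonly cited APCE definition and the classic peak-to-sidelobe ratio, not the improved PSLR variant proposed in this paper, whose details are not given here. The function names `apce`, `fuse_responses`, and `psr`, the sum-to-one weight normalization, and the `exclude` window size are all assumptions made for illustration.

```python
import numpy as np

EPS = 1e-12  # guards against division by zero on flat response maps


def apce(response):
    """Average peak-to-correlation energy of a 2-D response map.

    Common definition: |F_max - F_min|^2 / mean((F - F_min)^2).
    A sharp single peak gives a high value; a noisy map gives a low one.
    """
    f_max = response.max()
    f_min = response.min()
    energy = np.mean((response - f_min) ** 2)
    return (f_max - f_min) ** 2 / (energy + EPS)


def fuse_responses(responses):
    """Weight each channel's response map by its normalized APCE score.

    Assumption: weights are APCE scores normalized to sum to one.
    Returns the fused map and the per-channel weights.
    """
    scores = np.array([apce(r) for r in responses])
    weights = scores / scores.sum()
    fused = np.tensordot(weights, np.stack(responses), axes=1)
    return fused, weights


def psr(response, exclude=5):
    """Classic peak-to-sidelobe ratio: (peak - sidelobe mean) / sidelobe std.

    The sidelobe is everything outside a (2*exclude+1)^2 window around the
    peak. `exclude` is a hypothetical parameter, not taken from the paper.
    """
    row, col = np.unravel_index(np.argmax(response), response.shape)
    mask = np.ones(response.shape, dtype=bool)
    mask[max(row - exclude, 0):row + exclude + 1,
         max(col - exclude, 0):col + exclude + 1] = False
    sidelobe = response[mask]
    return (response[row, col] - sidelobe.mean()) / (sidelobe.std() + EPS)


if __name__ == "__main__":
    # Synthetic example: one confident channel and one noisy channel.
    ys, xs = np.mgrid[0:31, 0:31]
    sharp = np.exp(-((ys - 15) ** 2 + (xs - 15) ** 2) / (2 * 2.0 ** 2))
    noisy = np.random.default_rng(0).random((31, 31))

    fused, weights = fuse_responses([sharp, noisy])
    peak = np.unravel_index(np.argmax(fused), fused.shape)
    print("channel weights:", weights)
    print("fused peak location:", peak)
    print("PSR of fused map:", psr(fused))
```

In a tracker, one plausible use of the second score is to skip the model update whenever `psr(fused)` falls below a chosen threshold, so that unreliable frames do not enter the training set. The threshold itself would have to be tuned and is not specified in the text above.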