| Object tracking has developed rapidly because of its widespread use in security, drones, autonomous driving, face recognition, and other fields, and many creative algorithms have emerged. Siamese-network trackers have become some of the most popular thanks to their speed and accuracy. However, a tracker based on the Siamese network copes poorly with challenges such as similar distractors and occlusion, because it relies on similarity matching and fails to make full use of context information. To address these issues, we introduce reinforcement learning into object tracking and treat tracking as an online decision-making process. The contributions of this paper are as follows: (1) We propose a two-stage object tracking algorithm that integrates the Siamese network with a reinforcement learning decision network, targeting the Siamese network's reliance on similarity matching and its failure to make full use of background information. The algorithm first uses the similarity matching of the Siamese network to obtain the response map of target scores over the search area, which gives the approximate target region. Image patches at the potential target positions are then sent as candidates to the reinforcement learning decision network, which outputs the exact target position. (2) We use the meta-learning algorithm MAML to train a feature selector and set up a feature pool. The trained feature selector judges whether each tracking result should be added to the feature pool, and the filtered feature information is then used to update the network online. This addresses the poor online update quality caused by the insufficient robustness of online updating in traditional reinforcement learning algorithms. (3) We train the tracking network with soft actor-critic (SAC), an improved version of the actor-critic (AC) algorithm, and replace the original lightweight feature extraction network with a ResNet pre-trained on ImageNet. SAC introduces the concept of maximum entropy, which raises the probability of selecting high-quality actions, resolves the tendency of traditional AC-based tracking algorithms to fall into local optima (which leads to poor tracking accuracy), and enhances the robustness of the algorithm. |
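
The first stage described in contribution (1) can be illustrated with a minimal sketch, which is not the paper's implementation. It assumes the template and search features are plain NumPy arrays of shape (channels, height, width) and that the response map is a simple sliding-window cross-correlation. The function names `response_map` and `top_k_candidates` and the parameter `k` are hypothetical; the reinforcement learning decision network that refines the candidates is not modeled here.

```python
import numpy as np


def response_map(template_feat, search_feat):
    """Cross-correlate template features with search features.

    Both inputs are assumed to have shape (channels, height, width).
    Returns a 2-D map with one similarity score per template placement.
    """
    _, th, tw = template_feat.shape
    _, sh, sw = search_feat.shape
    scores = np.empty((sh - th + 1, sw - tw + 1))
    for i in range(scores.shape[0]):
        for j in range(scores.shape[1]):
            window = search_feat[:, i:i + th, j:j + tw]
            scores[i, j] = np.sum(window * template_feat)
    return scores


def top_k_candidates(scores, k=3):
    """Return the (row, col) positions of the k highest scores.

    In a two-stage tracker these positions would be cropped and passed
    to a second-stage network; that step is outside this sketch.
    """
    order = np.argsort(scores, axis=None)[::-1][:k]
    return [
        tuple(int(v) for v in np.unravel_index(idx, scores.shape))
        for idx in order
    ]


if __name__ == "__main__":
    # Toy data: a block of ones hidden in an otherwise empty search area.
    template = np.ones((2, 3, 3))
    search = np.zeros((2, 8, 8))
    search[:, 2:5, 3:6] = 1.0
    scores = response_map(template, search)
    print(top_k_candidates(scores, k=1))
```

In this toy setup the highest score falls where the template fully overlaps the embedded block; a real tracker would compute the features with a learned backbone and run the correlation as a batched convolution.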