| Object tracking is a highly regarded research direction in machine vision. Its main task is to locate and track objects in real time in video sequences. Given the object’s initial location in the first frame, a tracker must determine the object’s scale and position in subsequent frames and obtain its motion parameters, which requires a series of complex computational steps such as feature extraction, object detection, and template updating. Compared to other computer vision problems, object tracking faces particular challenges such as object occlusion and scale variation. Although object-tracking algorithms have made progress, a truly universal, robust, accurate, and efficient algorithm remains out of reach. This thesis proposes two object-tracking algorithms based on attention mechanisms and designs a single-object tracking system, as follows: (1) Traditional object-tracking algorithms do not introduce multi-scale features, which leads to weak feature extraction ability and low tracking accuracy. This thesis proposes the D-TransT algorithm based on TransT, which uses a feature fusion network based on attention mechanisms to replace traditional correlation filtering operations. The deformable attention module in D-TransT can naturally aggregate multi-scale features and adaptively focus on edge information and on information about similar targets to better locate the target position. Experimental results show that D-TransT converges 29.4% faster than TransT and has better prediction ability. On the LaSOT dataset, AUC, PNorm, and P reach 65.6%, 73.3%, and 69.1%, respectively. The experiments also show that the proposed tracker outperforms most state-of-the-art trackers. (2) To solve the problem of semantic information loss caused by the local linear matching operation of traditional trackers, this thesis proposes a hierarchical attention network, HAT, based on SiamCAR. The network
focuses on key elements in each region and ignores irrelevant information, thus saving computational resources and obtaining the most useful information as quickly as possible. On the UAV123 dataset, the accuracy and success rate of this tracker improve by 1.9% and 1.2%, respectively, compared to SiamCAR. On the GOT-10k dataset, AUC, PNorm, and P reach 61.8%, 72.7%, and 50.6%, respectively. The experiments show that the model outperforms many state-of-the-art trackers on the OTB100, UAV123, and GOT-10k datasets. (3) This thesis develops a single-object tracking system, based on multi-scale features and attention mechanisms, using the Spring Boot framework. The design emphasizes modularization, reusability, object-oriented design patterns, and dependency injection to improve the system’s maintainability and scalability. Additionally, the reliability of the system and the engineering significance of the algorithm are verified. |
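
The abstract reports AUC and P without defining them. As a rough sketch only, assuming the one-pass evaluation convention common to OTB- and LaSOT-style benchmarks (boxes given as (x, y, w, h), 21 evenly spaced IoU thresholds, a 20-pixel precision threshold), the two metrics could be computed as below. The function names and default parameters are illustrative assumptions, not taken from the thesis, and normalized precision (PNorm) is omitted.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def center_error(a, b):
    """Euclidean distance in pixels between the centers of two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    dx = (ax + aw / 2) - (bx + bw / 2)
    dy = (ay + ah / 2) - (by + bh / 2)
    return (dx * dx + dy * dy) ** 0.5


def success_auc(preds, gts, n_thresholds=21):
    """Area under the success plot: mean success rate over IoU thresholds.

    Assumption: a frame counts as a success when IoU is strictly greater
    than the threshold, with thresholds evenly spaced on [0, 1].
    """
    overlaps = [iou(p, g) for p, g in zip(preds, gts)]
    thresholds = [i / (n_thresholds - 1) for i in range(n_thresholds)]
    rates = [sum(o > t for o in overlaps) / len(overlaps) for t in thresholds]
    return sum(rates) / len(rates)


def precision(preds, gts, pixel_threshold=20.0):
    """Fraction of frames whose center error is within pixel_threshold."""
    errors = [center_error(p, g) for p, g in zip(preds, gts)]
    return sum(e <= pixel_threshold for e in errors) / len(errors)


if __name__ == "__main__":
    # Toy sequence: one perfect prediction and one complete miss.
    gts = [(0, 0, 10, 10), (0, 0, 10, 10)]
    preds = [(0, 0, 10, 10), (100, 100, 10, 10)]
    print("success AUC:", success_auc(preds, gts))
    print("precision @ 20 px:", precision(preds, gts))
```

Benchmark toolkits differ in details such as the number of thresholds and whether the comparison is strict, so reported numbers should be reproduced with the official toolkit of each dataset rather than with a sketch like this one.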