
Research On Object Tracking Based On Pixel-wise Alignment Siamese Network

Posted on: 2023-10-24
Degree: Master
Type: Thesis
Country: China
Candidate: Y Xiao
Full Text: PDF
GTID: 2568306836972359
Subject: Electronic and communication engineering
Abstract/Summary:
Visual tracking, also referred to as object tracking, is a classical and active topic in computer vision. It usually comprises three components: detecting objects of interest, tracking those objects frame by frame, and recognizing the behavior of the tracked objects. Object tracking has a wide range of real-world applications, including visual surveillance, robotics, video and image editing, and video games. Object trackers based on the Siamese network have shown great success in competitions and publications in recent years, but they still have shortcomings, which show up in scenarios such as small targets, fast motion, and extensive scale variations. This thesis uses a two-stage target tracker as the reference network. Following a coarse-to-fine approach, it makes three main contributions:
(1) To handle complex scenes in which long-distance cameras produce small targets and target sizes change rapidly, an adaptive dilated fusion model is proposed. It adds dilated convolution branches with different dilation sizes to the last layer of the feature extraction network and applies adaptive feature fusion across those branches. Experiments show that the adaptive dilated convolution fusion strategy achieves the same effect as adding a feature pyramid network, while reducing the number of parameters and increasing tracking speed. Experiments also show that the adaptive dilated fusion model copes effectively with complex scenes such as small targets and extensive scale variations.
(2) The similarity measure of the Siamese network tracker (the cross-correlation algorithm) disperses information, and the feature fusion strategy enlarges the receptive field but introduces background noise. To address both problems, the thesis proposes a depth pixel cross-correlation algorithm, which uses maximum pooling and average pooling to obtain pixel-level template features, extracting pixel-level correlation information and reducing the impact of background noise. Experiments show that the depth pixel cross-correlation algorithm copes effectively with complex scenes such as background noise and similar targets.
(3) Classification features and regression features conflict, and fixed-size candidate anchors cannot cope with size changes and deformation. Feature alignment algorithms are introduced to address these problems: the pyramid RoIAlign module (PRoIAlign) solves the problem of fixed proposals, and a dual-head decoupling network decouples the classification and regression features. Experiments show that the feature alignment algorithms effectively improve the accuracy of target tracking.
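The idea behind pixel-level correlation in contribution (2) can be illustrated with a minimal NumPy sketch. It is not the thesis's implementation: the function names, the tensor shapes, and the way max and average pooling are applied to the template are all assumptions made for illustration, since the abstract does not specify them. The first function treats every template pixel as a 1x1 kernel and correlates it with every search pixel. The second shows one plausible use of max- and average-pooled template descriptors as kernels.

```python
import numpy as np


def pixelwise_correlation(template, search):
    """Correlate every template pixel with every search pixel.

    template: (C, Hz, Wz) feature map of the target exemplar.
    search:   (C, Hx, Wx) feature map of the search region.
    Returns an array of shape (Hz*Wz, Hx, Wx): one similarity map
    per template pixel (each template pixel acts as a 1x1 kernel).
    """
    c, hz, wz = template.shape
    c_search, hx, wx = search.shape
    if c != c_search:
        raise ValueError("template and search must share the channel count")
    kernels = template.reshape(c, hz * wz)   # column k is the k-th 1x1 kernel
    pixels = search.reshape(c, hx * wx)      # column p is the p-th search pixel
    similarity = kernels.T @ pixels          # (Hz*Wz, Hx*Wx) dot products
    return similarity.reshape(hz * wz, hx, wx)


def pooled_correlation(template, search):
    """Hypothetical variant: correlate pooled template descriptors.

    Assumption: the template is reduced to two channel descriptors by
    global max pooling and global average pooling, and each descriptor
    is correlated with every search pixel. Returns shape (2, Hx, Wx).
    """
    c, hx, wx = search.shape
    flat = template.reshape(template.shape[0], -1)
    kernels = np.stack([flat.max(axis=1), flat.mean(axis=1)])  # (2, C)
    return (kernels @ search.reshape(c, hx * wx)).reshape(2, hx, wx)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    z = rng.standard_normal((4, 3, 3))   # toy template features
    x = rng.standard_normal((4, 8, 8))   # toy search features
    print(pixelwise_correlation(z, x).shape)
    print(pooled_correlation(z, x).shape)
```

Compared with sliding the whole template over the search region as a single kernel, the per-pixel form keeps one response map per template location, so background pixels inside the template box contribute separate maps rather than being mixed into a single response.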
Keywords/Search Tags:Object tracking, Siamese network, Adaptive dilated fusion, Depth pixel-wise correlation, Feature alignment