Video Multi-Object Tracking Based On Feature Aggregation And Information Propagation Under Self-Attention Mechanism

Posted on: 2024-03-02    Degree: Master    Type: Thesis
Country: China    Candidate: Y F Zhang    Full Text: PDF
GTID: 2568307133950759    Subject: Computer Science and Technology
Abstract/Summary:
Multi-object tracking (MOT) locates multiple targets in a video and maintains their identities to generate motion trajectories. The technology has important applications in fields such as video surveillance, intelligent transportation, and human-computer interaction. In recent years, video multi-object tracking based on the self-attention mechanism has become an important research direction. However, because of occlusion, appearance similarity, and related problems in complex scenes, self-attention-based video multi-object tracking methods still struggle to achieve efficient tracking performance. This thesis focuses on target feature extraction and on the propagation and update of features along the temporal dimension under the self-attention-based tracking paradigm. It studies feature propagation based on multi-scale channel feature aggregation and a video multi-object tracking method based on masked attention and iterative association. The main research contents of this thesis are as follows:

(1) A video multi-object tracking method based on multi-scale channel feature aggregation is proposed. To address the identity switches caused by long-term occlusion of targets in complex scenes, this thesis constructs a multi-scale channel feature enhancement network that compensates for the loss of local features caused by self-attention computation, and realizes end-to-end video multi-object tracking with a tracking query detection module and a feature update module built on the Transformer model. The method improves the extraction of key target features, maintains consistent target identities for small and occluded targets, and reduces occlusion-induced identity switches, yielding a robust video multi-object tracking method.

(2) A video multi-object tracking method based on masked attention and iterative association is proposed. To address the loss of local information caused by the global attention mechanism in complex scenes, this thesis constructs a model based on a masked cross-attention mechanism to strengthen its focus on local features, and realizes iterative data association by using low-confidence detection results. With masked attention, the method removes a large amount of unnecessary computation and improves target feature representation; in the trajectory association stage, it makes fuller use of the detection information to improve trajectory matching accuracy.
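The abstract does not spell out how low-confidence detections enter the association step. As a rough illustration only, the sketch below assumes a two-stage scheme: tracks are first matched to high-confidence detections, and tracks left unmatched are then matched to low-confidence detections. The IoU cost, the greedy matching, the function names, and the thresholds `high_thr`, `low_thr`, and `min_iou` are all assumptions made for this example, not details taken from the thesis.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(tracks, detections, min_iou):
    """Pair track ids with detection indices, highest IoU first (greedy, not optimal)."""
    pairs = sorted(
        ((iou(box, det["box"]), tid, di)
         for tid, box in tracks.items()
         for di, det in enumerate(detections)),
        reverse=True,
    )
    matches, used_tracks, used_dets = {}, set(), set()
    for score, tid, di in pairs:
        if score < min_iou:
            break  # every remaining pair overlaps even less
        if tid in used_tracks or di in used_dets:
            continue
        matches[tid] = di
        used_tracks.add(tid)
        used_dets.add(di)
    return matches


def associate(tracks, detections, high_thr=0.6, low_thr=0.1, min_iou=0.3):
    """Two-stage association; all thresholds are hypothetical defaults."""
    high = [d for d in detections if d["score"] >= high_thr]
    low = [d for d in detections if low_thr <= d["score"] < high_thr]

    assigned = {}
    # Stage 1: match all tracks against high-confidence detections.
    for tid, di in greedy_match(tracks, high, min_iou).items():
        assigned[tid] = high[di]["box"]

    # Stage 2: give still-unmatched tracks a chance on low-confidence detections.
    remaining = {tid: box for tid, box in tracks.items() if tid not in assigned}
    for tid, di in greedy_match(remaining, low, min_iou).items():
        assigned[tid] = low[di]["box"]
    return assigned


# Track 2 is only recovered through the low-confidence detection (score 0.3).
tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
detections = [
    {"box": (1, 1, 11, 11), "score": 0.9},
    {"box": (21, 20, 31, 30), "score": 0.3},
    {"box": (50, 50, 60, 60), "score": 0.05},
]
result = associate(tracks, detections)
```

In this toy example a partially occluded target whose detector score drops to 0.3 still keeps its identity, which is the kind of case the low-confidence stage is meant to cover. A real tracker would typically replace the greedy step with an optimal assignment solver and combine IoU with appearance features.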
Keywords/Search Tags: Multi-object tracking, self-attention mechanism, feature transfer, masked attention