Deep Learning Network Construction Techniques For UAV Video-to-Ground Target Tracking

Posted on: 2023-06-22   Degree: Master   Type: Thesis
Country: China   Candidate: X D Sun   Full Text: PDF
GTID: 2532306788956459   Subject: Electronic Science and Technology
Abstract/Summary:
UAV video-to-ground target tracking is an important research direction in visual target tracking. It offers a large detection range, strong autonomy, and wide applicability, and it has broad application prospects in military and civilian fields such as battlefield monitoring and traffic supervision. However, the characteristics of the UAV platform make targets in the captured video difficult to track: target scales differ markedly, many similar objects surround the tracked target and easily cause false alarms, and the target's appearance changes rapidly. All of these degrade tracking performance. Moreover, most existing tracking algorithms focus on natural scenes, and there is no complete solution for tracking from the UAV field of view. To address these challenges, this thesis proposes a deep learning network construction technique for UAV video-to-ground target tracking. Building on the Siamese-FC network, it adds in sequence a feature enhancement extraction module based on multi-scale dilated convolution, a contextual target information perception module based on attention fusion, and a dynamic template update method based on the target's variable appearance. The specific research is summarized as follows.

First, to address the large scale variation of a single target and the obvious scale differences between target classes during UAV tracking, this thesis proposes a feature enhancement extraction module based on multi-scale dilated convolution. The module replaces the conventional convolutional layers of the backbone network with a multi-branch structure whose branches use convolution kernels of different sizes, and it further enlarges the receptive field of each branch's feature map by applying dilated convolution. This makes fuller use of deep features, strengthens the feature representation capability of the model, and allows the scale information of the tracked target to be captured accurately and extracted effectively.

Second, to address false alarms caused by similar objects around the tracked target, this thesis proposes a contextual target information perception module based on attention fusion, which provides a new means of feature fusion. Two self-contextual attention mechanisms adaptively fuse the information within each feature map to obtain rich contextual semantic information. Two cross-fusion attention mechanisms then fuse the feature maps of the two branches so that each branch fully obtains the feature information of the other, and this process is iterated multiple times. The result is a comprehensive perception of feature information and a more robust model that adapts to more complex tracking environments.

Third, to address the rapid, highly dynamic changes in target appearance caused by the high-speed movement of the UAV platform during tracking, this thesis proposes a dynamic template update method based on the target's variable appearance. An LSTM controller generates a "read" signal that extracts features from a feature memory to build the final template, which is then used to obtain the tracking result. The same LSTM controller also generates "write" signals that store the features of the new target in the feature memory according to certain rules, thereby updating the stored feature information. The model can therefore fully perceive changes in target appearance, which improves its adaptability and stability in long-term tracking.

Finally, this thesis conducts ablation experiments on UAV remote sensing datasets to verify the influence of each module on the overall network model. Quantitative and qualitative analyses systematically verify the effectiveness of the proposed network model in addressing the challenges described above.
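
The multi-branch dilated convolution idea behind the feature enhancement module can be illustrated with a small, dependency-free Python sketch. This is not the thesis's module: the 1-D signal, the all-ones kernels, the three branch settings, and the function names (effective_kernel_size, dilated_conv1d, multi_branch_features) are assumptions chosen only for illustration. The sketch shows how a kernel of size k with dilation d spans k + (k - 1)(d - 1) input positions, so branches with different kernel sizes and dilations see the same input at different scales.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Number of input positions spanned by a dilated kernel: k + (k - 1) * (d - 1)."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """'Valid' 1-D dilated convolution (cross-correlation form, no padding)."""
    span = effective_kernel_size(len(kernel), dilation)
    out_len = len(signal) - span + 1
    return [
        sum(kernel[j] * signal[i + j * dilation] for j in range(len(kernel)))
        for i in range(out_len)
    ]


def multi_branch_features(signal, branches):
    """Apply every (kernel, dilation) branch; key each output by its effective span."""
    return {
        effective_kernel_size(len(kernel), dilation): dilated_conv1d(signal, kernel, dilation)
        for kernel, dilation in branches
    }


# Hypothetical toy input and branch settings (assumptions, not taken from the thesis).
signal = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
branches = [
    ([1, 1, 1], 1),        # span 3: fine scale
    ([1, 1, 1], 2),        # span 5: same kernel size, wider receptive field
    ([1, 1, 1, 1, 1], 2),  # span 9: larger kernel plus dilation
]
features = multi_branch_features(signal, branches)

for span, values in sorted(features.items()):
    print(f"effective span {span}: {values}")
```

In a real tracker the branches would be 2-D convolutions with learned weights, and their outputs would be padded to a common size and fused; the toy version only makes the receptive-field arithmetic concrete.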
Keywords/Search Tags:single target tracking, UAV remote sensing videos, Siamese network, multi-level feature learning, dynamic memory