Deep Learning Network Construction Techniques For UAV Video-to-Ground Target Tracking

Posted on:2023-06-22

Degree:Master

Type:Thesis

Country:China

Candidate:X D Sun

Full Text:PDF

GTID:2532306788956459

Subject:Electronic Science and Technology

Abstract/Summary:

UAV video-to-ground target tracking is an important research direction in visual target tracking. It offers a large detection range, strong autonomy, and wide applicability, and it has broad application prospects in military and civilian fields such as battlefield monitoring and traffic supervision. However, because of the particular characteristics of the UAV platform, the captured video poses several difficulties that degrade tracking performance: the scale of tracked targets varies markedly, similar objects around the target readily cause false alarms, and the target's appearance changes rapidly. Moreover, most existing tracking algorithms focus on natural scenes, and there is no complete solution for tracking under the UAV field of view.

To address these challenges, this thesis proposes a deep learning network construction technique for UAV video-to-ground target tracking. Based on the Siamese-FC network, we build in sequence a feature enhancement extraction module based on multi-scale dilated convolution, a contextual target information perception module based on attention fusion, and a dynamic template update method based on variable target morphology. The specific research is summarized as follows.

Firstly, to address large scale variations of the same target and obvious scale differences between target classes during UAV tracking, this thesis proposes a feature enhancement extraction module based on multi-scale dilated convolution. The module replaces the traditional convolutional layers of the backbone with a multi-branch convolutional structure that uses kernels of different sizes, and it further enlarges the receptive field of each branch's feature map through dilated convolution. This makes full use of the deep features, strengthens the model's feature representation, and allows the scale information of the tracked target to be captured accurately and extracted effectively.

Secondly, to address false alarms caused by similar objects around the tracked target, this thesis proposes a contextual target information perception module based on attention fusion. It builds a new feature fusion scheme: two self-contextual attention mechanisms adaptively fuse feature map information to obtain rich contextual semantic information, and two cross-fusion attention mechanisms fuse the feature maps of the two branches so that each branch fully obtains the feature map information of the other. This process is iterated multiple times. The result is all-round perception of feature information and greater robustness in more complex tracking environments.

Thirdly, to address the real-time, highly dynamic change in target appearance caused by the high-speed movement of the UAV platform, this thesis proposes a dynamic template update method based on the variable shape of the target. An LSTM controller generates a "read" signal to extract features from the feature memory, which are used to build the final template and obtain the tracking result. The same controller also generates "write" signals that store the feature information of the new target in the feature memory according to certain rules, thereby updating it. The model can thus fully perceive changes in target appearance, which improves its adaptability and stability in long-term tracking.

Finally, this thesis conducts ablation experiments on UAV remote sensing datasets to verify the influence of each module on the overall network model. Quantitative and qualitative analysis systematically verifies the effectiveness of the proposed network model in addressing these challenges.
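
The multi-branch, multi-scale idea described in the abstract can be illustrated with a minimal one-dimensional sketch in plain Python. It is not the thesis implementation: the function names, the all-ones kernels, and the branch settings (kernel sizes 3, 3, 5 with dilations 1, 2, 2) are assumptions chosen only for illustration. The one fact it relies on is the standard relation that a kernel with k taps and dilation d spans k + (k - 1)(d - 1) input positions.

```python
def effective_kernel(k: int, d: int) -> int:
    """Input span covered by a k-tap kernel with dilation d."""
    return k + (k - 1) * (d - 1)


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D dilated correlation over a list of numbers."""
    span = effective_kernel(len(kernel), dilation)
    return [
        # Kernel taps are spaced `dilation` samples apart.
        sum(w * signal[i + j * dilation] for j, w in enumerate(kernel))
        for i in range(len(signal) - span + 1)
    ]


def multi_branch(signal, branches):
    """Apply every (kernel, dilation) branch to the same input."""
    return [dilated_conv1d(signal, k, d) for k, d in branches]


# Hypothetical branch settings, not taken from the thesis.
BRANCHES = [([1, 1, 1], 1), ([1, 1, 1], 2), ([1, 1, 1, 1, 1], 2)]

signal = list(range(12))
outputs = multi_branch(signal, BRANCHES)
spans = [effective_kernel(len(k), d) for k, d in BRANCHES]
print(spans)                      # input span seen by each branch
print([len(o) for o in outputs])  # output length of each branch
```

Each branch sees a different span of the input, which is how a multi-branch layer can respond to targets of different scales. A real tracker would use learned two-dimensional kernels with padding so that the branch outputs share one spatial size and can be combined.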

Keywords/Search Tags:

single target tracking, UAV remote sensing videos, Siamese network, multi-level feature learning, dynamic memory

Related items

1	Research On Target Tracking In Remote Sensing Videos Based On Deep Convolutional Network
2	Deep Learning And Spatio-Temporal Awareness For Object Tracking In Remote Sensing Videos
3	Searching And Technology Of Remote Sensing Target Tracking In Unmanned Aerial Vehicle Videos Based On Depth Perception Modeling
4	Design Of Multi-target Detection Method For Remote Sensing Image Based On Deep Neural Network
5	Research Of Object Tracking Based On Convolutional Neural Network
6	Research On Multi-temporal Remote Sensing Target Change Detection And Classification Based On Deep Learning
7	Research On Target Tracking Method Of UAV Aerial Photography Based On Siamese Networks
8	UAV Video Remote Sensing Target Tracking Based On Deep Convolutional Networ
9	Single Target Tracking And Application Based On Siamese Network
10	Research On Target Tracking Algorithm Of Tethered UAV Based On Siamese Network