Object tracking is an important task in computer vision with broad application prospects. With the development of artificial intelligence, we have gradually entered the era of intelligent informatization, and object-tracking applications such as intelligent video surveillance, drone patrol, and intelligent driving are accelerating into daily life. Thanks to the rapid development of deep learning, Siamese tracking models that extract deep features with deep neural networks have achieved good performance in both tracking accuracy and tracking speed. However, when these trackers are deployed in real scenarios on mobile devices or embedded hardware platforms with limited computing resources and power budgets, deep-learning-based trackers struggle to meet basic real-time requirements, and the performance of terminal hardware cannot be expected to leap forward overnight. It is therefore important to make these computation-heavy deep models lightweight. Lightweighting a deep neural network means substantially reducing its computation and parameter counts while barely affecting its accuracy, thereby lowering resource requirements and inference latency. The main approaches are neural architecture search, lightweight network design, and knowledge distillation, all of which have proven effective in image classification. Current research on tracking models focuses mainly on improving tracking accuracy; lightweight methods for trackers have received little attention. Therefore, this paper first analyzes the structural characteristics and computational load of the Siamese network tracking model, comprehensively compares model lightweighting methods, and finally designs a knowledge distillation framework for the Siamese network tracking model. The main research content of this paper is as follows:

(1) Structurally, a Siamese tracking model consists of a backbone network and a head. The backbone performs feature extraction, determines the quality of the features used throughout the network, and accounts for the largest share of the model's computation and parameters. Accordingly, drawing on the strengths of the knowledge distillation lightweighting approach and the characteristics of the object tracking task, this paper designs a single-teacher knowledge distillation architecture for the Siamese network tracking model. In this architecture, the teacher's softened logits and the attention knowledge of an intermediate feature layer are defined, and this knowledge guides the training of the student network: a purpose-designed distillation loss transfers the teacher's knowledge to the student during training. This paper verifies the effectiveness of knowledge transfer under this architecture from the perspective of model enhancement, and uses the architecture to train a lightweight tracking network model.
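The abstract does not spell out the distillation loss, so the following PyTorch fragment is only a minimal sketch of one plausible formulation, assuming Hinton-style softened-logits distillation and the attention-transfer definition of Zagoruyko and Komodakis. The temperature, the weights `alpha` and `beta`, and the function names are illustrative placeholders rather than the thesis's actual design.

```python
import torch.nn.functional as F

def attention_map(feat):
    # Spatial attention of a (B, C, H, W) feature map: sum of squared
    # channel activations, flattened and L2-normalized per sample.
    # This is one common definition; the thesis's exact form may differ.
    att = feat.pow(2).sum(dim=1).flatten(1)        # (B, H*W)
    return F.normalize(att, dim=1)

def single_teacher_distill_loss(s_logits, t_logits, s_feat, t_feat,
                                temperature=4.0, alpha=0.5, beta=100.0):
    # Softened-logits knowledge: KL divergence between temperature-scaled
    # class distributions (classification logits flattened to (B, K)).
    kd = F.kl_div(
        F.log_softmax(s_logits / temperature, dim=-1),
        F.softmax(t_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    # Attention knowledge of an intermediate feature layer; assumes the
    # student and teacher maps share the same spatial resolution.
    at = F.mse_loss(attention_map(s_feat), attention_map(t_feat))
    # In training, this term would be added to the usual tracking loss.
    return alpha * kd + beta * at
```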
(2) Training on multiple datasets to improve a model's generalization has become common practice. However, simply merging multiple datasets for training may reduce accuracy, because merging ignores the differences between datasets. Large deep models can learn these differences thanks to having more neurons, whereas lightweight networks are sharply limited in depth and width. To achieve better generalization for the lightweight model, this paper extends the single-teacher design to a multi-teacher knowledge distillation architecture: models trained on individual datasets serve as multiple teachers that jointly guide the same student network, fusing the teachers' dataset-specific knowledge into a single student. Model enhancement experiments show that this multi-teacher architecture helps the student learn the knowledge of multiple teachers, and lightweighting experiments confirm that the method effectively improves the generalization performance of the lightweight student model.

In summary, based on the characteristics of knowledge distillation, this paper designs single-teacher and multi-teacher knowledge distillation methods for the object tracking task. The former is better suited to tracking in a single specific scenario, while the latter extends to multiple task scenarios. A series of model enhancement and ablation experiments demonstrates the effectiveness of knowledge transfer under these frameworks, and a series of lightweighting experiments shows that both methods perform well in improving model accuracy and inference speed.
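As with the single-teacher case, the abstract leaves the exact combination rule open; the sketch below assumes a simple weighted sum of per-teacher distillation losses against a shared student, reusing `single_teacher_distill_loss` from the previous sketch. Equal weighting is an assumption, not the thesis's stated scheme.

```python
def multi_teacher_distill_loss(s_logits, s_feat, teachers, weights=None):
    # `teachers` is a list of (logits, feat) pairs, one per
    # dataset-specific teacher model guiding the same student.
    # Equal weighting is assumed when no weights are given.
    if weights is None:
        weights = [1.0 / len(teachers)] * len(teachers)
    total = 0.0
    for w, (t_logits, t_feat) in zip(weights, teachers):
        # Reuses single_teacher_distill_loss from the previous sketch.
        total = total + w * single_teacher_distill_loss(
            s_logits, t_logits, s_feat, t_feat)
    return total
```

Under this scheme, each teacher contributes its dataset-specific knowledge through the same loss form, so the student absorbs a fused supervision signal without ever training on the merged datasets directly.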