Deep learning models, typified by convolutional neural networks, deliver strong performance but often suffer from complex structures, dense parameters, and large model sizes, which seriously hinders the application of deep learning algorithms in resource-limited scenarios. Against this background, model compression techniques have emerged, with the goal of producing small-scale yet high-accuracy network models. Knowledge distillation (KD), a common and effective model compression method, brings the task performance of a lightweight student network close to that of a large, complex teacher network through distillation learning between the two models. This thesis proposes improved knowledge distillation methods for image classification and object detection. The main work is as follows:

(1) Adaptive Feature Decoupling Knowledge Distillation (AFDKD) is proposed for image classification. Existing distillation methods are inefficient for two reasons. On the one hand, the teacher and student models differ in structure and therefore in learning capacity, so it is difficult for the student network to directly imitate the teacher's feature outputs. On the other hand, the distillation point is usually fixed, ignoring the fact that different locations in the teacher network carry different distillation value and that the most valuable location varies from sample to sample. AFDKD refines the teacher features that participate in distillation through a feature decoupling module, making them easier for the student to learn. At the same time, an adaptive weight allocation network dynamically assigns appropriate loss weights to different distillation points according to the input sample.

(2) Multi-task Fusion Knowledge Distillation (MFKD) is proposed for object detection. Existing distillation methods often exploit the teacher model's features insufficiently and usually adopt feature-distillation strategies borrowed directly from image classification, which are relatively simplistic. Moreover, owing to the nature of the detection task, foreground and background pixels are typically distributed unevenly within a sample, which makes the loss computed for each region during distillation unreasonable. MFKD designs corresponding distillation tasks for these problems: it computes the feature loss through a "spatial-channel" attention mask, allocating loss weights to each region of the feature map more accurately, and it models the global pixel relationships of a sample through a self-attention mechanism, enriching the feature dimensions available for distillation.

(3) Based on the two distillation algorithms above, a model lightweighting system built on knowledge distillation is designed and implemented. For image classification and object detection tasks, the system provides users with functions for saving and uploading models and datasets, distillation-based model optimization, and algorithm verification.

Experiments on the CIFAR-100, Tiny-ImageNet, and COCO datasets show that the two kinds of methods proposed in this thesis effectively improve distillation performance and offer a degree of theoretical advancement. The system's algorithm verification module further confirms the reliability and practical operability of the proposed algorithms through fire recognition and pedestrian detection tasks in UAV scenarios.
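
To make the adaptive weight allocation in AFDKD concrete, the following is a minimal PyTorch-style sketch of the idea, not the thesis's actual implementation: every module name, shape, and design choice below (the MLP weight network, the MSE feature loss, the softmax normalization) is an illustrative assumption.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class AdaptiveWeightNet(nn.Module):
        # Predicts per-sample loss weights for K candidate distillation points.
        def __init__(self, in_dim, num_points):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(in_dim, 128),
                nn.ReLU(inplace=True),
                nn.Linear(128, num_points),
            )

        def forward(self, x):
            # x: (batch, in_dim) descriptor of the input sample, e.g. globally
            # pooled backbone features; softmax makes the K weights sum to 1.
            return F.softmax(self.mlp(x), dim=-1)

    def afdkd_feature_loss(student_feats, teacher_feats, weights):
        # student_feats / teacher_feats: lists of K (B, C, H, W) tensors with
        # matching shapes (shape-aligning projections are omitted here).
        # weights: (B, K) per-sample weights from AdaptiveWeightNet.
        per_point = torch.stack(
            [F.mse_loss(s, t, reduction="none").mean(dim=(1, 2, 3))
             for s, t in zip(student_feats, teacher_feats)],
            dim=1,  # -> (B, K)
        )
        return (weights * per_point).sum(dim=1).mean()

In a training loop, the weights would be predicted from a compact descriptor of each input and used to combine the per-point imitation losses with the standard classification loss, so that each sample emphasizes the distillation locations most valuable to it.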
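Similarly, the "spatial-channel" attention mask in MFKD can be sketched as follows. The abstract does not specify the exact construction, so this sketch follows a common recipe from attention-guided detection distillation (magnitude-based masks normalized with a temperature-scaled softmax); all function names and parameters are assumptions.

    import torch
    import torch.nn.functional as F

    def attention_masks(feat, temperature=0.5):
        # Derive spatial and channel attention masks from a teacher feature map.
        # feat: (B, C, H, W). Returns a (B, 1, H, W) spatial mask and a
        # (B, C, 1, 1) channel mask, each rescaled to have mean 1.
        b, c, h, w = feat.shape
        spatial = feat.abs().mean(dim=1, keepdim=True)            # (B, 1, H, W)
        spatial = F.softmax(spatial.flatten(2) / temperature, dim=-1)
        spatial = spatial.view(b, 1, h, w) * h * w
        channel = feat.abs().mean(dim=(2, 3))                     # (B, C)
        channel = F.softmax(channel / temperature, dim=-1).view(b, c, 1, 1) * c
        return spatial, channel

    def masked_feature_loss(student_feat, teacher_feat):
        # Feature-imitation loss reweighted by the teacher's attention masks,
        # so informative (typically foreground) regions dominate the loss.
        spatial, channel = attention_masks(teacher_feat)
        diff = (student_feat - teacher_feat) ** 2
        return (diff * spatial * channel).mean()

In MFKD's full setting, a mask-weighted loss of this kind would be combined with the self-attention relation loss over global pixel relationships and the detector's own training losses.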