With the development of deep convolutional neural networks (DCNNs), DCNN-based algorithms have surpassed traditional algorithms in many computer vision tasks. However, because these models contain a large number of redundant features, they consume substantial storage and computational resources, which makes them difficult to deploy on resource-constrained platforms. If all features could be represented by a set of orthogonal features, a complex CNN could be replaced with a lightweight CNN without any loss of accuracy; obtaining such a set of orthogonal features in CNNs is therefore of great academic and practical value. This dissertation exploits knowledge distillation to remove redundant features and provides a possible solution to this goal. Redundant features are features whose removal does not affect the model's performance; in practice, however, many model optimization algorithms do impair performance. Knowledge distillation uses the prior information provided by a complex network to bridge the performance gap caused by feature removal, so that a compact network can achieve performance comparable to the complex network and redundant features can be eliminated more thoroughly. Accordingly, this dissertation focuses on knowledge distillation algorithms that shrink the performance gap between these two types of networks.

Feature extraction and classification are two basic steps in traditional pattern recognition algorithms; deep learning integrates them into a single end-to-end system. Nevertheless, these two factors remain important and directly influence model performance. Features are discriminative if intra-class features are close and inter-class features are distant; theoretically, discriminative features are easier to classify correctly. Likewise, a classifier with good generalization ability adapts well to new samples. Obtaining more discriminative features and better-generalized classifiers is a long-standing pursuit of the research community. From the perspective of discriminative features and generalized classifiers, this dissertation therefore exploits the strong prior knowledge of complex models to improve the performance of lightweight models, removing redundancy more effectively and improving the orthogonality between features.

The contributions of this dissertation are as follows:

1) A knowledge distillation algorithm based on discriminative features and generalized classifiers. Considering the distinct characteristics of feature learning and decision making, this dissertation adopts different strategies to train two types of complex teacher networks. One provides the classifier with generalized prior information for its decisions and guides it to output a smoother probability distribution; the other imparts a discriminative intermediate-feature prior to the feature extractor, which improves the lightweight student network's ability to extract effective features. Experiments with different network architectures and datasets verify the effectiveness of the proposed algorithm.

2) An extension of the above algorithm to the case where no training samples are available. Most existing methods assume that the training dataset is available, but not all practical scenarios satisfy this assumption. To address the setting where the training dataset is inaccessible and only the trained complex model is available, we propose a method to obtain a lightweight network with comparable performance. Concretely, the algorithm above is integrated into an existing data-free knowledge distillation algorithm, which brings the classification performance of the lightweight network closer to that of the complex network. Experimental results indicate that the proposed method also performs well when no training data are available.

3) A face attribute analysis system based on this algorithm. Such systems usually have to perform multiple tasks and output many face attributes, which consumes considerable memory and computational resources, so removing redundant features is especially important here. Based on the above work, this dissertation implements a system that analyses multiple face attributes; with redundancy eliminated, the system achieves performance comparable to complex models at lower memory and computational cost.
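The "smoother probability distribution" that the teacher imparts to the student's classifier is, in standard knowledge distillation, produced by a temperature-scaled softmax. A minimal sketch in plain NumPy (all names and the temperature value are illustrative, not the dissertation's actual implementation):

```python
import numpy as np

def softmax(z, T=1.0):
    # Temperature T > 1 softens the distribution, exposing the
    # teacher's relative confidence in the non-target classes.
    z = np.asarray(z, dtype=float) / T
    e = np.exp(z - z.max())
    return e / e.sum()

def distillation_loss(student_logits, teacher_logits, T=4.0):
    # KL divergence between the softened teacher and student
    # distributions, scaled by T^2 so gradients stay comparable
    # across temperatures (as in the classic KD formulation).
    p = softmax(teacher_logits, T)   # soft targets from the teacher
    q = softmax(student_logits, T)   # student's softened prediction
    return T ** 2 * np.sum(p * (np.log(p) - np.log(q)))

teacher = [6.0, 2.0, 1.0]   # illustrative logits
student = [5.0, 2.5, 1.5]
loss = distillation_loss(student, teacher)
```

With `T = 1` this reduces to the ordinary KL divergence between the two class distributions; larger temperatures emphasize the ratios among the wrong-class probabilities, which is exactly the generalized prior information the student's classifier learns from.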
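The notion that features are discriminative when intra-class distances are small and inter-class distances are large can be made concrete with a simple ratio. The sketch below is a rough illustration only (the self-distances included in the intra-class mean slightly deflate it), not a metric used in the dissertation:

```python
import numpy as np

def mean_pairwise_dist(a, b):
    # Mean Euclidean distance between every vector in a and every vector in b.
    d = a[:, None, :] - b[None, :, :]
    return np.linalg.norm(d, axis=-1).mean()

def discriminability_ratio(features, labels):
    """Ratio of mean inter-class to mean intra-class distance.

    Larger values mean same-class features cluster tightly while
    different-class features stay far apart, i.e. the features are
    more discriminative and easier to classify correctly.
    """
    labels = np.asarray(labels)
    classes = np.unique(labels)
    intra, inter = [], []
    for i, c in enumerate(classes):
        fc = features[labels == c]
        if len(fc) > 1:
            intra.append(mean_pairwise_dist(fc, fc))
        for c2 in classes[i + 1:]:
            inter.append(mean_pairwise_dist(fc, features[labels == c2]))
    return np.mean(inter) / np.mean(intra)
```

Two tight, well-separated clusters yield a large ratio, while overlapping clusters yield a ratio near 1; the teacher's intermediate-feature prior pushes the student toward the former regime.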
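The data-free setting of contribution 2 can be illustrated with a deliberately tiny toy: the "teacher" below is just a frozen linear map, and random probe inputs stand in for the generator that practical data-free distillation algorithms train against the teacher. Everything here is a hypothetical stand-in, not the dissertation's method:

```python
import numpy as np

rng = np.random.default_rng(0)

# A frozen "teacher": a fixed linear map standing in for a trained
# complex network (illustrative only).
W_teacher = rng.normal(size=(8, 3))

def teacher(x):
    return x @ W_teacher

# No training set is available, so we probe the teacher with synthetic
# inputs and train the student to match the teacher's outputs on them.
W_student = np.zeros((8, 3))
lr = 0.01
for _ in range(2000):
    x = rng.normal(size=(32, 8))          # synthetic batch
    err = x @ W_student - teacher(x)      # mismatch with teacher outputs
    W_student -= lr * x.T @ err / len(x)  # gradient step on 0.5*||err||^2
```

After training, `W_student` closely matches `W_teacher` even though no real data was ever seen; real data-free KD replaces the random probes with generated pseudo-samples shaped by the teacher's responses.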