| With the rapid development of modern network technology and artificial intelligence, neural network models have shown great potential in computer vision and have set off a wave of research on deep learning. Shallow neural networks generalize poorly and cannot meet users' computing needs. However, as deep neural networks grow deeper, their demand for computing resources increases rapidly, which severely limits the deployment and application of these models on mobile terminal devices. It is therefore of great significance to compress and accelerate neural networks. According to granularity, pruning methods can be divided into structured pruning and unstructured pruning. At present, structured pruning methods mostly use a general property of filters as a unified pruning criterion, such as the L2 norm, average rank, or information entropy. Model compression is achieved by removing a certain proportion of the filters that score lowest on that criterion. However, this approach tends to ignore the functional integrity of the filters and the overall representation ability of the model, which can have an irreversible impact on model accuracy. It is therefore important to preserve the functional integrity of the network and to prune according to a similarity-based partition in order to maintain accuracy. This paper studies the similarity between filters and feature maps in convolutional neural networks, explores the balance between compression rate and accuracy, and aims to maximize model compression while preserving model accuracy. The specific research contents are as follows: (1) To address the problem that the functional integrity of the network is destroyed after pruning, this paper proposes TCFP, a pruning method for convolutional neural networks based on functional similarity. The feature representation obtained by maximally activating each filter in a layer is taken as that filter's specific extraction function, and the clustering of
filters at each convolutional layer is used to achieve function division. A first pruning based on the L2 norm is then carried out within each cluster of functionally similar filters to reduce the functional redundancy of the convolutional layer. Next, based on the influence of each cluster on the accuracy of the whole model, inter-cluster soft pruning is carried out on top of the first intra-cluster pruning, with a threshold introduced as the pruning criterion, while the model is fine-tuned at the same time. Finally, extensive experiments verify the effectiveness and stability of the proposed pruning method based on filter function similarity, and show that the algorithm can improve the pruning rate while maintaining the overall feature structure of the model. (2) A structured pruning method based on feature map similarity is proposed, exploiting the mapping between feature maps and filters. The similarity between the feature maps of a convolutional layer is computed together with their average rank; among similar feature maps, the ranks are compared and the filters corresponding to the feature maps with the larger average rank are retained, thereby compressing the model. In addition, average pooling is used in place of the fully connected layer to reduce the redundancy of the fully connected parameters, and the network is finally fine-tuned to restore accuracy. Extensive experimental results show that the pruning method based on feature map similarity outperforms other channel-granularity pruning algorithms in compression performance. |
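
The first, intra-cluster pruning step described in (1) can be illustrated with a minimal sketch. It is not the TCFP implementation: the function names, the flattened-filter input layout, the `keep_ratio` parameter, and the rule of keeping at least one filter per cluster are all assumptions made for illustration, and the cluster labels are assumed to come from a separate clustering step that is not shown.

```python
import math
from collections import defaultdict


def l2_norm(weights):
    """Euclidean (L2) norm of one flattened filter."""
    return math.sqrt(sum(w * w for w in weights))


def prune_within_clusters(filters, cluster_ids, keep_ratio=0.5):
    """Return the sorted indices of filters kept after intra-cluster pruning.

    filters     -- list of flattened filter weights (assumed layout)
    cluster_ids -- one cluster label per filter, assumed to be produced
                   by an earlier functional-clustering step
    keep_ratio  -- assumed fraction of filters kept inside each cluster
    """
    if len(filters) != len(cluster_ids):
        raise ValueError("filters and cluster_ids must have equal length")

    # Group filter indices by their functional cluster.
    clusters = defaultdict(list)
    for idx, cid in enumerate(cluster_ids):
        clusters[cid].append(idx)

    kept = []
    for members in clusters.values():
        # Keep at least one filter so every functional group survives
        # (an assumption of this sketch).
        n_keep = max(1, math.ceil(keep_ratio * len(members)))
        # Rank filters inside the cluster by L2 norm, largest first.
        ranked = sorted(members, key=lambda i: l2_norm(filters[i]), reverse=True)
        kept.extend(ranked[:n_keep])
    return sorted(kept)


if __name__ == "__main__":
    toy_filters = [[3, 4], [0.1, 0.1], [1, 0], [0, 2], [0.5, 0.5], [6, 8]]
    toy_clusters = [0, 0, 1, 1, 1, 0]
    print(prune_within_clusters(toy_filters, toy_clusters, keep_ratio=0.5))
```

Ranking filters inside each cluster, rather than across the whole layer, means a cluster whose filters all have small norms still keeps a representative, which is the functional-integrity concern the abstract raises about layer-wide criteria.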