
Research On Acceleration Method Of Convolution Neural Network Model

Posted on: 2024-02-01
Degree: Master
Type: Thesis
Country: China
Candidate: X N Li
Full Text: PDF
GTID: 2568307058982309
Subject: Master of Electronic Information (Professional Degree)
Abstract/Summary:
The emergence of deep learning has accelerated the process of informatization and promoted the development of convolutional neural networks. Convolutional neural networks are now widely used in text processing, audio processing, and image and visual processing, and have brought a qualitative leap to deep learning in the field of computer vision. To pursue higher accuracy, however, these networks keep growing in scale, which raises computation cost and memory demand. This makes it difficult to deploy them on resource-constrained devices or in low-latency, real-time applications. As network depth increases, model acceleration therefore becomes essential and has become a major research goal.

To apply deep learning on resource-constrained devices and relieve computing pressure, the convolutional neural network model must be accelerated. Accelerating a model is, in essence, compressing it. Model compression and acceleration methods generally include pruning, distillation, and quantization. Model pruning sparsifies the weights of the network, which can produce irregular convolution structures; it is divided mainly into unstructured pruning and structured pruning. Model distillation trains a student model with fewer parameters to learn the generalization ability of a large-scale teacher model, and the result is usually better than directly training the smaller model from scratch. Model quantization reduces the bit width of the weights, replacing full-precision floating-point values with lower-precision representations to cut the model's computation. Different acceleration methods thus involve different research content.

This thesis aims to improve execution speed through model pruning, design pruning methods better suited to existing convolutional neural networks, make the models lightweight while preserving accuracy, and build an application system on edge nodes. The main research contents are as follows:

First, through the design of transfer learning, a layer-pruning method named Yolov3-Pruning(transfer) is proposed for the target detection model. By testing the pruned parts of the model and comprehensively weighing the latency, parameter count, and detection accuracy of the 13×13, 26×26, and 52×52 feature layers, the optimal pruned configuration is selected. Compared with the traditional Yolov3 model, it removes the 13×13 feature layer and 9 convolutional layers from the backbone network, reducing the model's parameters from 235.37 MB to 76.13 MB, i.e., to roughly one-third. To preserve the detection accuracy and overall performance of the model, this thesis also adopts a transfer-learning design: the baseline Yolov3 serves as the teacher network, Yolov3-Pruning(transfer) serves as the student network, and the parameters of the trained teacher network guide the student's training. Transfer learning in this way avoids the loss of detection accuracy in the pruned model.

Second, this thesis proposes StdPrune, a channel-based model pruning and acceleration method. A scaling factor is added to the BatchNorm layers of the network, and L1 and L2 norm regularization is applied to it, so that each channel's scaling factor is tied to the importance of its weights, reducing the target loss. The weight magnitude of each channel in the convolutional layers is computed, each channel's weight is compared with the mean, and the values are stored in a tensor sorted in ascending order; the model is then trimmed according to the chosen pruning degree. This removes redundant parameters, which not only preserves the model's detection accuracy but also
speeds up its execution.

Third, model acceleration is applied to edge nodes, and an edge-cloud system named EdgeDet is proposed. Image data are collected on IoT terminal devices and uploaded to edge nodes, where the trained network model is deployed for image detection and recognition. When the computing power of an edge node is insufficient, part of the convolution computation is offloaded to the cloud service center to reduce the computing pressure at the edge, so that the model can be deployed at the network edge.

Through the above three parts, the scale and computation of the models are reduced while their detection accuracy is preserved across several convolutional neural networks. The experimental results show that Yolov3-Pruning has roughly one-third the parameters of the baseline Yolov3 model and achieves real-time processing. On the CIFAR-10 dataset, the channel-pruned StdPrune: VGG19 (40% pruned) model reduces the parameters by 13.59 M compared with the baseline VGG19 model while reaching a detection accuracy of 93.79%, a good result. Finally, with the help of edge nodes and a cloud service center, this thesis builds an edge-cloud system that deploys the network model at edge nodes, which facilitates rapid deployment of the model and reduces the pressure on the cloud service center.
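The teacher-student training described above is commonly realized with a distillation-style loss that blends the hard-label cross-entropy with a soft-target term from the teacher. The following is a minimal, framework-free sketch of such a loss on a single example; the function names, the temperature T, and the weight alpha are illustrative assumptions, not the thesis's exact formulation.

```python
import math

def softmax(logits, T=1.0):
    # Temperature-scaled softmax over a list of logits.
    exps = [math.exp(z / T) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_idx, T=2.0, alpha=0.5):
    """Blend hard-label cross-entropy with a soft-target term from the teacher.

    alpha weights the hard loss; the soft term is scaled by T*T, as is
    conventional, to keep gradient magnitudes comparable across temperatures.
    """
    p_student = softmax(student_logits)
    hard = -math.log(p_student[true_idx])                 # cross-entropy on the true label
    ps_T = softmax(student_logits, T)                     # softened student distribution
    pt_T = softmax(teacher_logits, T)                     # softened teacher distribution
    soft = -sum(t * math.log(s) for t, s in zip(pt_T, ps_T))
    return alpha * hard + (1 - alpha) * (T * T) * soft
```

In practice the same blend is computed per batch inside the training loop, with the teacher's logits produced by a frozen forward pass.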
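The channel-selection step of the StdPrune method, ranking channels by weight magnitude in ascending order and trimming the lowest fraction, can be sketched as follows. This is a simplified illustration on plain lists; in the thesis the magnitudes come from the BatchNorm scaling factors and convolutional weights of the actual model, and the function name is assumed here.

```python
def select_channels(channel_weights, prune_ratio):
    """Rank channels by magnitude and return the indices of channels to keep.

    channel_weights: per-channel importance scores (e.g. BatchNorm scaling
    factors or mean absolute filter weights).
    prune_ratio: fraction of channels to remove, between 0.0 and 1.0.
    """
    # Sort channel indices from least to most important.
    order = sorted(range(len(channel_weights)),
                   key=lambda i: abs(channel_weights[i]))
    n_prune = int(len(channel_weights) * prune_ratio)
    pruned = set(order[:n_prune])          # the least-important channels
    return [i for i in range(len(channel_weights)) if i not in pruned]
```

For example, pruning 40% of five channels with magnitudes [0.9, 0.05, 0.4, 0.01, 0.7] removes the two smallest (indices 1 and 3) and keeps channels 0, 2, and 4.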
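The EdgeDet idea of moving part of the convolution computation to the cloud when the edge node runs out of capacity can be illustrated by a simple greedy split over per-layer compute costs. This is only a schematic model of the partitioning decision, with hypothetical names and a scalar compute budget standing in for the edge node's real capacity.

```python
def plan_offload(layer_costs, edge_budget):
    """Greedy split: run consecutive layers at the edge until the compute
    budget is exhausted, then offload the remaining layers to the cloud.

    Returns (edge_layers, cloud_layers) as two cost lists.
    """
    used = 0.0
    for i, cost in enumerate(layer_costs):
        if used + cost > edge_budget:
            # This layer no longer fits at the edge; cut here.
            return layer_costs[:i], layer_costs[i:]
        used += cost
    return layer_costs, []  # everything fits at the edge
```

A real system would also weigh the transfer cost of the intermediate feature map at the cut point, which can dominate for early layers with large activations.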
Keywords/Search Tags:Convolutional neural network, Model compression and acceleration, Transfer learning, Target detection, Model pruning