Research On Neural Network Parameter Compression And Inference Acceleration

Posted on:2021-05-28

Degree:Master

Type:Thesis

Country:China

Candidate:Y Zhou

GTID:2370330623965015

Subject:Computer technology

Abstract/Summary:

In recent years, deep neural networks have developed rapidly, and their powerful feature extraction and fitting capabilities have led to wide use in image recognition, natural language processing, speech recognition, and other fields. To improve model performance, researchers generally design deeper and more complex networks. This greatly increases the number of parameters and the amount of computation, demands ever more hardware resources (CPU, GPU memory, bandwidth), and makes the cost very high. At the same time, it is very difficult to deploy such complex networks directly on mobile devices with limited computing resources and battery life (such as mobile phones, drones, robots, and smart glasses). This thesis addresses the problem by improving both the compactness of the model and the efficiency of its computation. The main contributions of this work are:

1. Based on the lightweight neural network MobileNet, Tensor-Train decomposition is used to compress the 1 × 1 convolutions in the depthwise separable convolutions. An adaptive Tensor-Train decomposition algorithm is proposed to avoid the complex tuning otherwise needed to find the optimal decomposition ranks. On the CIFAR-10 dataset, the proposed model has only 20%-30% of the parameters of MobileNet.

2. Because Tensor-Train decomposition alone gives no obvious speedup of the forward pass on the GPU, this work builds on the adaptive Tensor-Train decomposition and uses smaller decomposition dimensions with moderate ranks to reduce the number of parameters. A dynamic programming algorithm then finds the optimal computation order for each decomposed layer of the network, which reduces the amount of computation of the model.

3. A real-time target detection network is built for mobile devices. Experiments show that, compared with the SSD target detection network based on the original MobileNet V2, the proposed method roughly doubles model inference speed: on a Huawei Honor V10 mobile phone, the number of frames detected per second increased from 15 FPS to about 30 FPS.
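The two ideas behind the first two contributions can be illustrated with a small sketch. It is not the thesis's implementation. It assumes the standard TT-matrix format, in which a 1 × 1 convolution with C_in = m_1···m_d input channels and C_out = n_1···n_d output channels is stored as d cores of shape (r_{k-1}, m_k, n_k, r_k) with r_0 = r_d = 1. It also uses the textbook matrix-chain dynamic program as a stand-in for the computation-order search. The function names, the 256-channel factorization, and the ranks are hypothetical choices made only for illustration.

```python
from math import prod


def tt_param_count(in_modes, out_modes, ranks):
    """Parameter count of a TT-matrix whose k-th core has shape
    (ranks[k], in_modes[k], out_modes[k], ranks[k + 1])."""
    if len(in_modes) != len(out_modes) or len(ranks) != len(in_modes) + 1:
        raise ValueError("need d input modes, d output modes and d + 1 ranks")
    if ranks[0] != 1 or ranks[-1] != 1:
        raise ValueError("boundary TT ranks must be 1")
    return sum(
        ranks[k] * m * n * ranks[k + 1]
        for k, (m, n) in enumerate(zip(in_modes, out_modes))
    )


def matrix_chain_order(dims):
    """Textbook matrix-chain DP; matrix i has shape dims[i] x dims[i + 1].

    Returns (minimal number of scalar multiplications, parenthesization).
    """
    n = len(dims) - 1
    cost = [[0] * n for _ in range(n)]   # cost[i][j]: best cost for M_i..M_j
    split = [[0] * n for _ in range(n)]  # split[i][j]: best split position k
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            cost[i][j], split[i][j] = min(
                (cost[i][k] + cost[k + 1][j]
                 + dims[i] * dims[k + 1] * dims[j + 1], k)
                for k in range(i, j)
            )

    def render(i, j):
        if i == j:
            return f"M{i}"
        k = split[i][j]
        return f"({render(i, k)} {render(k + 1, j)})"

    return cost[0][n - 1], render(0, n - 1)


# Hypothetical layer: 256 -> 256 channels, factored as 4 * 8 * 8, TT ranks 8.
in_modes, out_modes, ranks = [4, 8, 8], [4, 8, 8], [1, 8, 8, 1]
dense = prod(in_modes) * prod(out_modes)         # weights of the dense 1 x 1 conv
tt = tt_param_count(in_modes, out_modes, ranks)  # weights stored in the TT cores
print(dense, tt)                                 # 65536 4736
print(matrix_chain_order([10, 30, 5, 60]))       # (4500, '((M0 M1) M2)')
```

In the second example the order (M0 M1) M2 costs 4,500 multiplications, whereas M0 (M1 M2) would cost 27,000. The result is the same, but the computation order changes the work by a factor of six, which is the kind of saving an order search over the decomposed layers aims for.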

Keywords/Search Tags:

Tensor decomposition, parameter compression, quantization, mobile target detection

Related items

1	Statistical Model Based On Matrix Decomposition And Tensor Decomposition And Its Application
2	Fourth Order Tensor Decomposition Approach And Its Application
3	Sea Surface Target Detection Based On The Combination Of Multi-polarization Features
4	Research On Tensor Decomposition Based On Parameter Estimation
5	Decomposition And Compression Of Tensor Networks And Their Applications
6	Research On Dense Group Target Resolution And Parameter Estimation Method
7	The Application Of Tensor Decomposition In Neural Network Model Compression
8	Research On Underwater Fish Target Detection Algorithm Based On Bionic Dolphin
9	Multidimensional Parameter Estimation Of Array Based On Tensor Decomposition
10	Quantic Tensor Train Decomposition And Its Application On Feature Dimension Reduction