
Research on Acceleration Methods for Neural Network Inference in Object Detection

Posted on: 2021-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liu
Full Text: PDF
GTID: 2428330602482329
Subject: Integrated circuit engineering
Abstract/Summary:
Object detection is a popular research direction in computer vision and is widely used in fields such as intelligent security, smart cities, power-line inspection, and autonomous driving. Compared with traditional algorithms based on hand-designed features, object detection algorithms built on deep neural networks have achieved large improvements in both detection accuracy and detection speed. However, deep neural networks are storage- and computation-intensive: they run in real time on high-performance GPUs but are difficult to deploy on embedded platforms with limited storage and computing resources. It is therefore of great practical significance to study how to deploy object detection algorithms on embedded platforms while preserving detection accuracy.

This thesis combines lightweight neural networks, model compression, and computation acceleration techniques to implement a real-time object detection system on an embedded ARM platform, based on the YOLOv3 object detection algorithm. The specific work is as follows:

(1) Several mainstream lightweight neural networks are compared for use with YOLOv3. MobileNetV2, which has few parameters and strong feature-extraction ability, is selected to replace Darknet-53 as the backbone of YOLOv3 (a minimal backbone sketch follows this abstract).

(2) For model compression of the convolutional neural network, a channel-pruning method is used. It targets convolutional layers that are followed by a batch normalization layer: each channel of the feature map corresponds to a scaling factor in the batch normalization layer. An L1 regularization penalty is applied to these scaling factors to make them sparse, the importance of each channel is then evaluated from the magnitude of its scaling factor, and a fixed proportion of the least important channels is pruned, compressing the model (a pruning sketch follows below). Experiments show that this method significantly improves the inference efficiency of the network while maintaining accuracy.

(3) To port the object detection algorithm to embedded ARM and x86 platforms, adjacent network layers of YOLOv3 are merged, saving storage space and reducing memory accesses (a fusion sketch follows below). In addition, vectorized (SIMD) instructions and OpenMP are used to accelerate the compute-intensive convolution operations. Experiments show that both techniques effectively improve the inference speed of the convolutional neural network on the embedded ARM platform.
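The backbone replacement in (1) amounts to exposing multi-scale feature maps from MobileNetV2 to the three YOLOv3 detection heads. The following is a minimal sketch of that idea, assuming PyTorch and torchvision; the split indices match torchvision's MobileNetV2 implementation and the class name is illustrative, not the thesis's actual code.

```python
import torch.nn as nn
from torchvision.models import mobilenet_v2

class MobileNetV2Backbone(nn.Module):
    """MobileNetV2 feature extractor in place of Darknet-53 for YOLOv3."""
    def __init__(self):
        super().__init__()
        features = mobilenet_v2(weights=None).features
        # Split the feature stack at strides 8, 16, and 32 so each of the
        # three YOLOv3 detection heads receives a map of matching scale.
        self.stage1 = features[:7]      # stride 8,  32 channels
        self.stage2 = features[7:14]    # stride 16, 96 channels
        self.stage3 = features[14:18]   # stride 32, 320 channels

    def forward(self, x):
        c3 = self.stage1(x)
        c4 = self.stage2(c3)
        c5 = self.stage3(c4)
        return c3, c4, c5   # consumed by the YOLOv3 neck/heads
```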
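The pruning criterion in (2) can be sketched as follows, again in PyTorch and in the spirit of the "network slimming" approach the abstract describes. The penalty weight and pruning ratio are illustrative assumptions, not the thesis's actual settings.

```python
import torch
import torch.nn as nn

def bn_l1_penalty(model, lam=1e-4):
    """L1 sparsity penalty on all BN scaling factors (gamma).
    Added to the detection loss during training, e.g.
    loss = detection_loss + bn_l1_penalty(model)."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

def select_channels_to_keep(model, prune_ratio=0.5):
    """Rank channels by |gamma| globally; mark the lowest
    prune_ratio fraction for removal."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)
    return {name: (m.weight.detach().abs() > threshold)
            for name, m in model.named_modules()
            if isinstance(m, nn.BatchNorm2d)}
```

After training with the penalty, the boolean masks returned above would drive the actual surgery on the convolution weights (dropping the corresponding output channels and the matching input channels of the next layer), followed by fine-tuning to recover accuracy.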
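The layer merging in (3) is, for convolution followed by batch normalization, a standard algebraic fold: at inference time BN computes y = gamma * (x - mean) / sqrt(var + eps) + beta, which can be absorbed into the convolution's weights and bias. A minimal sketch, assuming PyTorch (the thesis implements the equivalent optimization in its ARM/x86 inference code, together with SIMD and OpenMP acceleration of the convolutions):

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold an inference-mode BN layer into the preceding convolution."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride, conv.padding,
                      conv.dilation, conv.groups, bias=True)
    # Per-output-channel scale: gamma / sqrt(var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = (conv.bias if conv.bias is not None
                 else torch.zeros_like(bn.running_mean))
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias
    return fused
```

The fused layer produces the same outputs as the conv+BN pair but needs one fewer pass over the feature map, which is where the saved memory accesses come from.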
Keywords/Search Tags: object detection, deep neural network, embedded platform, model compression, computation acceleration