
Research on Acceleration Methods for Neural Network Inference in Object Detection

Posted on: 2021-01-06
Degree: Master
Type: Thesis
Country: China
Candidate: X J Liu
Full Text: PDF
GTID: 2428330602482329
Subject: Integrated circuit engineering
Abstract/Summary:
Object detection is a popular research direction in computer vision and is widely used in fields such as intelligent security, smart cities, power-line inspection, and autonomous driving. Compared with traditional algorithms based on hand-designed features, object detection algorithms built on deep neural networks have achieved large improvements in both detection accuracy and detection speed. However, deep neural networks are storage- and computation-intensive: they run in real time on high-performance GPUs but are difficult to deploy on embedded platforms with limited storage and computing resources. It is therefore of great practical significance to study how to deploy object detection algorithms on embedded platforms while preserving detection accuracy.

This thesis combines lightweight neural networks, model compression, and computation acceleration techniques to implement a real-time object detection system on an embedded ARM platform, based on the YOLOv3 object detection algorithm. The specific work is as follows:

(1) Several mainstream lightweight neural networks are compared for use with YOLOv3. MobileNetV2, which has few parameters and strong feature-extraction ability, is selected to replace Darknet-53 as the backbone of YOLOv3 (a minimal backbone sketch follows this abstract).

(2) For model compression of the convolutional neural network, a channel-pruning method is used. It targets convolutional layers that are followed by a batch normalization layer: each channel of the feature map corresponds to a scaling factor in the batch normalization layer. An L1 regularization penalty is applied to these scaling factors to make them sparse, the importance of each channel is then evaluated from the magnitude of its scaling factor, and a fixed proportion of the least important channels is pruned, compressing the model (a pruning sketch follows below). Experiments show that this method significantly improves the inference efficiency of the network while maintaining accuracy.

(3) To port the object detection algorithm to embedded ARM and x86 platforms, adjacent network layers of YOLOv3 are merged, saving storage space and reducing memory accesses (a fusion sketch follows below). In addition, vectorized (SIMD) instructions and OpenMP are used to accelerate the compute-intensive convolution operations. Experiments show that both techniques effectively improve the inference speed of the convolutional neural network on the embedded ARM platform.
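The backbone replacement in (1) amounts to exposing multi-scale feature maps from MobileNetV2 to the three YOLOv3 detection heads. The following is a minimal sketch of that idea, assuming PyTorch and torchvision; the split indices match torchvision's MobileNetV2 implementation and the class name is illustrative, not the thesis's actual code.

```python
import torch.nn as nn
from torchvision.models import mobilenet_v2

class MobileNetV2Backbone(nn.Module):
    """MobileNetV2 feature extractor in place of Darknet-53 for YOLOv3."""
    def __init__(self):
        super().__init__()
        features = mobilenet_v2(weights=None).features
        # Split the feature stack at strides 8, 16, and 32 so each of the
        # three YOLOv3 detection heads receives a map of matching scale.
        self.stage1 = features[:7]      # stride 8,  32 channels
        self.stage2 = features[7:14]    # stride 16, 96 channels
        self.stage3 = features[14:18]   # stride 32, 320 channels

    def forward(self, x):
        c3 = self.stage1(x)
        c4 = self.stage2(c3)
        c5 = self.stage3(c4)
        return c3, c4, c5   # consumed by the YOLOv3 neck/heads
```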
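The pruning criterion in (2) can be sketched as follows, again in PyTorch and in the spirit of the "network slimming" approach the abstract describes. The penalty weight and pruning ratio are illustrative assumptions, not the thesis's actual settings.

```python
import torch
import torch.nn as nn

def bn_l1_penalty(model, lam=1e-4):
    """L1 sparsity penalty on all BN scaling factors (gamma).
    Added to the detection loss during training, e.g.
    loss = detection_loss + bn_l1_penalty(model)."""
    return lam * sum(m.weight.abs().sum()
                     for m in model.modules()
                     if isinstance(m, nn.BatchNorm2d))

def select_channels_to_keep(model, prune_ratio=0.5):
    """Rank channels by |gamma| globally; mark the lowest
    prune_ratio fraction for removal."""
    gammas = torch.cat([m.weight.detach().abs().flatten()
                        for m in model.modules()
                        if isinstance(m, nn.BatchNorm2d)])
    threshold = torch.quantile(gammas, prune_ratio)
    return {name: (m.weight.detach().abs() > threshold)
            for name, m in model.named_modules()
            if isinstance(m, nn.BatchNorm2d)}
```

After training with the penalty, the boolean masks returned above would drive the actual surgery on the convolution weights (dropping the corresponding output channels and the matching input channels of the next layer), followed by fine-tuning to recover accuracy.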
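The layer merging in (3) is, for convolution followed by batch normalization, a standard algebraic fold: at inference time BN computes y = gamma * (x - mean) / sqrt(var + eps) + beta, which can be absorbed into the convolution's weights and bias. A minimal sketch, assuming PyTorch (the thesis implements the equivalent optimization in its ARM/x86 inference code, together with SIMD and OpenMP acceleration of the convolutions):

```python
import torch
import torch.nn as nn

def fuse_conv_bn(conv: nn.Conv2d, bn: nn.BatchNorm2d) -> nn.Conv2d:
    """Fold an inference-mode BN layer into the preceding convolution."""
    fused = nn.Conv2d(conv.in_channels, conv.out_channels,
                      conv.kernel_size, conv.stride, conv.padding,
                      conv.dilation, conv.groups, bias=True)
    # Per-output-channel scale: gamma / sqrt(var + eps)
    scale = bn.weight / torch.sqrt(bn.running_var + bn.eps)
    fused.weight.data = conv.weight * scale.reshape(-1, 1, 1, 1)
    conv_bias = (conv.bias if conv.bias is not None
                 else torch.zeros_like(bn.running_mean))
    fused.bias.data = (conv_bias - bn.running_mean) * scale + bn.bias
    return fused
```

The fused layer produces the same outputs as the conv+BN pair but needs one fewer pass over the feature map, which is where the saved memory accesses come from.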
Keywords/Search Tags: object detection, deep neural network, embedded platform, model compression, computation acceleration