
Research And Implementation Of Hardware Acceleration For Object Detection Network

Posted on: 2023-06-18
Degree: Master
Type: Thesis
Country: China
Candidate: Y Y Hu
Full Text: PDF
GTID: 2568307025969839
Subject: Integrated circuit engineering
Abstract/Summary:
An Object Detection Network is a deep learning model that locates and classifies objects in input images. It is computationally intensive but highly parallel, and has been widely used in face recognition, military detection, maritime positioning, autonomous driving, and other fields. To improve detection accuracy, many Object Detection Networks have continually expanded their structures, increasing the number of parameters and the amount of computation; inference consequently becomes very slow, making such networks difficult to deploy on embedded platforms that have strict limits on storage and computing resources as well as real-time requirements. To meet the requirement of embedded deployment of an Object Detection Network, the main work of this thesis is as follows:

First, the Tiny-YOLOv3 network is selected as the algorithm model, and the model is optimized through feature-channel pruning, fixed-point quantization of activation values, power-of-two quantization of weights, and fusion of batch-normalization (BN) parameters.

Second, for the optimized model, a hardware accelerator for the convolution layers is designed. The accelerator contains a highly parallel array of processing elements that perform convolution operations, uses multi-level storage to realize ping-pong buffering, and pipelines the computations that follow the convolution.

Finally, a functional simulation environment is built to verify the convolution-layer accelerator, and an embedded CPU+FPGA deployment and verification system for the accelerator is implemented on a Zynq chip.

The experimental results show that, compared with the original model, the optimized Object Detection Network achieves a compression ratio of 5% with an mAP of 0.47, making it suitable for embedded deployment. The convolution-layer hardware accelerator delivers 48 GOPS of computing performance without using the DSP resources inside the Zynq chip, and the total power consumption of the hardware system is 2.656 W. Compared with existing research, the accelerator consumes fewer resources and achieves higher performance.
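Two of the optimization steps mentioned above, power-of-two weight quantization and BN parameter fusion, can be illustrated with a minimal NumPy sketch. This is an assumption-laden illustration of the general techniques, not the thesis's actual implementation; the exponent range and parameter shapes are hypothetical choices.

```python
import numpy as np

def quantize_weights_pow2(w, min_exp=-7, max_exp=0):
    """Quantize each weight to the nearest signed power of two.

    A power-of-two weight lets the multiplication in a convolution be
    replaced by a bit shift in hardware, which is one reason such an
    accelerator can avoid using DSP blocks. The exponent range
    (min_exp..max_exp) is a hypothetical choice.
    """
    sign = np.sign(w)
    mag = np.abs(w)
    # Clamp magnitudes before log2 so zeros do not produce -inf;
    # zero weights stay exactly zero via the final np.where.
    exp = np.round(np.log2(np.maximum(mag, 2.0 ** min_exp)))
    exp = np.clip(exp, min_exp, max_exp)
    return sign * np.where(mag > 0, 2.0 ** exp, 0.0)

def fuse_bn(w, b, gamma, beta, mean, var, eps=1e-5):
    """Fold batch-norm parameters into the preceding convolution's
    weights and bias so inference needs no separate BN step.

    w: conv weights of shape (out_ch, in_ch, kh, kw); b: bias (out_ch,);
    gamma/beta/mean/var: per-channel BN parameters of shape (out_ch,).
    """
    scale = gamma / np.sqrt(var + eps)
    w_fused = w * scale[:, None, None, None]
    b_fused = (b - mean) * scale + beta
    return w_fused, b_fused
```

For example, a weight of 0.3 is rounded to 0.25 (= 2^-2), so the multiply becomes a right shift by two; the fused BN produces a single convolution whose output equals conv-then-BN of the original pair.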
Keywords/Search Tags: Object Detection Network, Network Compression, Hardware Acceleration, Deep Learning