Object detection is widely used in civil and military fields such as artificial intelligence, medical research, and national defense security. Compared with traditional algorithms, deep learning-based object detection algorithms, which use Convolutional Neural Networks (CNNs) to extract features and perform image classification and localization, achieve far better performance. However, CNNs often have variable layer parameters and structures and require large numbers of parameters and computations, which makes such algorithms difficult to deploy in resource-constrained embedded applications that demand high speed and low power. Compared with GPU and ASIC embedded platforms, FPGAs offer the advantages of low cost, reconfigurability, and high energy efficiency. This paper implements hardware acceleration of a deep learning-based object detection algorithm on an FPGA platform. The main research work is as follows:

1. Based on the ZYNQ 7100 heterogeneous hardware platform and the hardware-acceleration analysis of CNN-based object detection algorithms, the research tasks were divided following a software/hardware co-design approach, and the overall architecture was designed under the given design requirements.

2. Building on the overall architecture, the Roofline model is used to evaluate the theoretical performance of typical deep learning-based object detection algorithms when implemented on the ZYNQ 7100 platform. Taking factors such as detection accuracy and model complexity into account, Mobilenet-SSD is selected as the object detection algorithm best suited for deployment on the platform. Its detection principle and network structure are then analyzed to clarify the software/hardware task allocation scheme for the algorithm.

3. Targeting both the standard convolution and the depthwise (DW) convolution of the depthwise separable convolutions in Mobilenet-SSD, the paper designs a CNN accelerator in the programmable logic (PL) using hardware optimization techniques such as parallelism, pipelining, and double buffering, and uses the Roofline model, together with the block-convolution idea, to find the best tiling and parallel-computation coefficients for the accelerator. To ensure that Mobilenet-SSD loses no accuracy, the accelerator processes 32-bit floating-point data. The accelerator is then invoked through DMA data transfers, and the functions of the processing system (PS) part are implemented.

4. Finally, functional verification and performance tests were carried out on the GVI CXZ7100 development board. The results show that the design is correct and fully meets the requirements. With an on-chip power consumption of only 8.527 W, the CNN accelerator reaches a peak computing performance of 26.67 GOP/s, and it processes the network about 110 times faster than a pure-software implementation without the accelerator. Compared with other related research, the CNN accelerator in this paper has advantages in both computational performance and detection throughput.
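The Roofline model referred to in items 2 and 3 bounds attainable throughput by whichever is lower: the compute roof or the memory-traffic roof. A minimal sketch of that evaluation, where the DDR bandwidth figure and the ops-per-byte ratios are illustrative assumptions (only the 26.67 GOP/s peak comes from the reported results):

```python
# Sketch of the Roofline bound used to evaluate candidate networks
# and tiling coefficients. Platform numbers other than the 26.67
# GOP/s peak are assumed for illustration, not measured values.

def roofline(peak_gops: float, bandwidth_gbs: float, ops_per_byte: float) -> float:
    """Attainable throughput in GOP/s: min(compute roof, CTC * bandwidth),
    where CTC (computation-to-communication ratio) is ops per byte moved."""
    return min(peak_gops, ops_per_byte * bandwidth_gbs)

# Assumed 4.2 GB/s effective DDR bandwidth on the PS-PL interface.
low_ctc = roofline(26.67, 4.2, 2.0)    # 8.4  -> memory-bound layer
high_ctc = roofline(26.67, 4.2, 10.0)  # 26.67 -> compute-bound layer
print(low_ctc, high_ctc)
```

A layer (or tiling choice) with a low computation-to-communication ratio sits under the bandwidth roof, which is why the tiling and parallelism coefficients are chosen to raise data reuse until the accelerator operates near its compute roof.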