Acceleration And Implementation Of Object Detection Algorithm Based On FPGA

Posted on:2019-09-05

Degree:Master

Type:Thesis

Country:China

Candidate:J Wu

GTID:2428330545469473

Subject:Electronic and communication engineering

Abstract/Summary:

Object detection algorithms have broad application prospects in national defense security, transportation monitoring, and medical research. A deep-learning-based object detection algorithm extracts features of the target object, classifies it, and finally displays the detected object. The entire network usually contains millions of neurons and millions of connections, so the computational load is very large, which makes object detection networks difficult to deploy on small, low-power mobile platforms. Currently, mainstream embedded GPU platforms have low energy efficiency, and it is difficult for them to run object detection algorithms in real time. FPGAs provide a large amount of design resources and can use parallel computing to accelerate object detection, so that it can run on small, low-power embedded platforms.

The main purpose of this thesis is to accelerate an object detection algorithm on an FPGA platform. After a comprehensive analysis of object detection algorithms, the thesis selects the YOLOv2 network as the target to be accelerated on the FPGA development board. An FPGA accelerator based on the OpenCL framework is designed according to the computational structure of the YOLOv2 algorithm. In the accelerator, the convolution layers, pooling layers, and Batch Normalization (BN) operations of the YOLOv2 network are parallelized and accelerated by a convolution kernel, a pooling kernel, and a BN kernel, respectively. This parallel computing approach significantly reduces the demand on computing resources and memory bandwidth while increasing computational throughput. Each kernel in the FPGA accelerator uses a pipelined computing architecture, which makes it possible to accelerate large networks. The thesis also quantizes 32-bit floating-point numbers into 8-bit fixed-point numbers, which reduces the FPGA memory footprint and the amount of data transferred, and also saves DSP computing resources.

Finally, the compiled OpenCL code of the YOLOv2 network was ported to the DE5-Net development board to run the acceleration experiment. The experiment verifies the feasibility of accelerating object detection on the FPGA platform: the running time of YOLOv2 object detection was reduced to about 450 ms at a power consumption of only 27 W. In addition to accelerating the YOLOv2 network on the FPGA platform, the thesis also implements an application in security inspection. To train knife, gun, and stick models with the Darknet deep learning framework and the YOLOv2 network, a new data set of knives, guns, and sticks is first created on the basis of the PASCAL VOC 2007 data set, and the network is then trained to obtain weights that can identify these objects. Finally, detection of dangerous items (knives, guns, and sticks) is realized on the DE5-Net development board.
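The abstract does not state which quantization scheme the thesis uses. As a rough illustration of how 32-bit floating-point values can be mapped to 8-bit fixed-point values, the following minimal Python sketch assumes a simple power-of-two scaling (a signed 8-bit code with a chosen number of fractional bits). The function names `choose_frac_bits`, `quantize`, and `dequantize` are hypothetical and are not taken from the thesis.

```python
import math

INT8_MIN, INT8_MAX = -128, 127


def choose_frac_bits(values):
    """Pick the largest fractional-bit count for which max |v| still fits in int8.

    Assumption: one shared power-of-two scale per tensor (not from the thesis).
    """
    max_abs = max(abs(v) for v in values)
    if max_abs == 0.0:
        return 7  # any scale works for all-zero data
    # We need max_abs * 2**frac_bits <= 127.
    return math.floor(math.log2(INT8_MAX / max_abs))


def quantize(values, frac_bits):
    """Map floats to signed 8-bit fixed-point codes, clipping to the int8 range."""
    scale = 2.0 ** frac_bits
    return [min(INT8_MAX, max(INT8_MIN, round(v * scale))) for v in values]


def dequantize(codes, frac_bits):
    """Recover approximate float values from the fixed-point codes."""
    scale = 2.0 ** frac_bits
    return [c / scale for c in codes]


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.125, 0.9]       # made-up example weights
    frac_bits = choose_frac_bits(weights)     # 7 fractional bits for this data
    codes = quantize(weights, frac_bits)
    print(codes)                              # [64, -32, 16, 115]
    print(dequantize(codes, frac_bits))
```

Each value then occupies 8 bits instead of 32, a fourfold reduction in storage and transfer volume, and multiplications operate on small integers rather than floating-point operands. The rounding error per value is at most half of one step, that is 2^-(frac_bits+1), as long as no clipping occurs.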

Keywords/Search Tags:

Object detection, FPGA hardware platform, YOLOv2 network, OpenCL

Related items

1	Acceleration Method Research On CNN Related Object Detection Algorithm Based On OpenCL
2	Research And Implementation Of YOLOv2 Network Based On FPGA
3	Research On Remaining Object Detection Algorithm Based On Improved YOLOv2 Network
4	Object Detection Algorithm Acceleration Based On OpenCL
5	OpenCL-based For Radar Imaging Algorithm On FPGA Platform
6	Design And Optimization Of Object Detection System For Neural Network Accelerations
7	Research On Moving Object Tracking Algorithm In Video And Implementation Of Hardware Acceleration
8	Design And Research Of Certificate Text Information Detection System Based On FPGA OpenCL
9	The Research And Validation Of Accelerated Compression Coding In Parallel Based On Hardware Platform
10	Compilation Optimization And Hardware Acceleration Of Object Detection Algorithm Based On Regional Proposal Network