The Design And Implementation Of Object Detection Chip Based On Deep Learning Algorithms

Posted on:2020-05-31

Degree:Master

Type:Thesis

Country:China

Candidate:Y H Zeng

GTID:2428330596994979

Subject:Circuits and Systems

Abstract/Summary:

With the growth of the Internet and the continued scaling described by Moore's law, deep learning has developed rapidly, driven by easier access to data and the increasing computing power of hardware. Object detection technology has improved greatly along with it. Object detection has a wide range of applications, including surveillance systems and online merchandise recognition, both computed in the cloud, as well as real-time object detection and map building on embedded devices. Advanced object detection techniques are mostly based on computation-intensive deep learning algorithms, which poses a challenge for resource-limited embedded devices. Given the data security risks of communication between an embedded device and a server, embedded devices should be able to run object detection algorithms locally. However, most embedded devices are resource-limited and are not designed to process CNNs (Convolutional Neural Networks). It is therefore important to study and design a deep learning based object detection chip. This thesis discusses the hardware architecture, performance, hardware utilization, power, and DRAM accesses, and designs and implements a deep learning based object detection chip. It focuses on the following:

(1) The characteristics of CNNs are analyzed, and a CNN hardware accelerator and its storage architecture are designed and implemented. A hybrid data reuse pattern is supported to reduce DRAM accesses, which lowers system power. The accelerator exploits high computational parallelism because its processing element matrix computes 2-D convolution efficiently. The register matrix layer combines the convolution, batch normalization, and activation function of a convolution layer with the pooling of a pooling layer, which enhances data reuse and speeds up processing between the convolution layer and the pooling layer.

(2) An object detection system based on YOLOv2-tiny and the CNN hardware accelerator above is proposed. System testing in fixed-point format is completed, and hardware/software co-design is applied to partition the whole system so as to exploit the computational strengths of different kinds of hardware. A detailed hardware architecture and performance analysis of the pre-processing, post-processing, and video stream in the detection system is also given.

(3) In addition to functional simulation of the CNN hardware accelerator, the complete ASIC back-end design flow with DC (Design Compiler) and ICC (IC Compiler) is presented, and the resulting power, area, and timing reports are analyzed.

(4) YOLOv2-tiny is chosen as the benchmark, and a Xilinx FPGA is selected as the design and simulation platform. According to the simulation and implementation results in Vivado, the architecture achieves 9.06 GMACs at 100 MHz with 32-bit and 16-bit fixed-point data precision, and the system power is 6.525 W. The proposed detection system can theoretically reach a processing rate of 3.63 fps. As the timing reports from DC and ICC show, the CNN hardware accelerator can run at 100 MHz after back-end design, with a chip area of only 3.5 mm × 3.5 mm and a power consumption of 204 mW.
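
The fusion of convolution, batch normalization, activation, and pooling attributed to the register matrix layer in item (1) can be illustrated in software. The following is only a minimal sketch under stated assumptions, not the thesis's hardware design: the function names are hypothetical, batch normalization is assumed to be folded into a single per-channel scale and shift (a common inference-time simplification), the activation is assumed to be a leaky ReLU with slope 0.1, and pooling is assumed to be 2x2 max pooling with stride 2. Floating-point values are used for readability, whereas the abstract reports fixed-point arithmetic. The point of the sketch is that the fused version never stores the full-resolution convolution output, which is the kind of intermediate data traffic that fusion avoids.

```python
def conv_at(image, kernel, r, c):
    """Multiply-accumulate one kernel window whose top-left corner is (r, c)."""
    return sum(image[r + i][c + j] * kernel[i][j]
               for i in range(len(kernel))
               for j in range(len(kernel[0])))


def bn_leaky(acc, scale, shift, slope):
    """Folded batch norm (assumed scale/shift form) followed by leaky ReLU."""
    y = acc * scale + shift
    return y if y > 0 else slope * y


def unfused_reference(image, kernel, scale, shift, slope=0.1):
    """Layer-by-layer reference: materializes the full activation map first."""
    out_h = len(image) - len(kernel) + 1
    out_w = len(image[0]) - len(kernel[0]) + 1
    act = [[bn_leaky(conv_at(image, kernel, r, c), scale, shift, slope)
            for c in range(out_w)]
           for r in range(out_h)]
    return [[max(act[2 * r][2 * c], act[2 * r][2 * c + 1],
                 act[2 * r + 1][2 * c], act[2 * r + 1][2 * c + 1])
             for c in range(out_w // 2)]
            for r in range(out_h // 2)]


def fused_conv_bn_act_pool(image, kernel, scale, shift, slope=0.1):
    """Fused version: each pooled output is produced directly from its four
    convolution windows, so no full-resolution intermediate map is stored."""
    out_h = (len(image) - len(kernel) + 1) // 2
    out_w = (len(image[0]) - len(kernel[0]) + 1) // 2
    pooled = []
    for pr in range(out_h):
        row = []
        for pc in range(out_w):
            best = None
            for dr in range(2):          # the 2x2 pooling window
                for dc in range(2):
                    acc = conv_at(image, kernel, 2 * pr + dr, 2 * pc + dc)
                    y = bn_leaky(acc, scale, shift, slope)
                    if best is None or y > best:
                        best = y
            row.append(best)
        pooled.append(row)
    return pooled


# Usage: a 4x4 input with a 3x3 kernel that passes through the window centre.
image = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
kernel = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
print(fused_conv_bn_act_pool(image, kernel, scale=1, shift=0))  # [[11]]
```

Both functions perform the same arithmetic in the same order, so they return identical results; the fused one differs only in what it has to keep in memory between stages.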

Keywords/Search Tags:

Hardware Acceleration, YOLO (You Only Look Once), Software/Hardware Co-Design, FPGA (Field Programmable Gate Array), Back-End Design

Related items

1	Research Of Hardware Acceleration Technique For Critical Algorithms Of DSP Applications
2	Software/Hardware Co-Design And Implementation Of 802.11 MAC Protocol Based On FPGA Technology
3	Mixed Hardware/Software Accelerator-Centric Heterogeneous Architectures
4	Design And Implementation Of The Software & Hardware Of A Control Board For The ATM Prototype Switch Onboard
5	Design And Implementation Of The Software & Hardware Of A Control Board For The ATM Prototype Switch Onboard
6	Hardware/Software Co-verification Solution Based On FPGA And ISS
7	Hardware Acceleration Design For Inertial Navigation Algorithm
8	Research Of Hardware Acceleration Technique For Image Matching Applications
9	Research And Design Of Convolutional Neural Network Accelerator Based On Multi-FPGA Co-acceleration
10	The Software Hardware Co-Accelerating Of GCN Action Recognition Model Based On FPGA