Design Of Energy-Efficient Object Detection System Based On FPGAs

Posted on:2024-03-14

Degree:Master

Type:Thesis

Country:China

Candidate:Z H Wang

GTID:2568307058982219

Subject:Master of Electronic Information (Professional Degree)

Abstract/Summary:

Object detection has become a popular research direction in edge artificial intelligence in recent years, with a wide range of applications in intelligent security, autonomous driving, industrial quality inspection, smart homes, and other fields. With the development of edge computing, more and more application scenarios need to run object detection on resource-constrained edge devices. However, the computational cost that convolutional neural networks incur to reach high accuracy places stricter demands on the design and optimization of object detection algorithms for edge devices. To meet the needs of edge intelligence, researchers have proposed a series of lightweight object detection algorithms, including MobileNet-SSDv2, Tiny-YOLO, Ultra_net, SkyNet, and ShuffleDet. To further improve the performance of object detection on edge devices, researchers have also proposed many optimization methods, such as network pruning, network quantization, and network distillation.

The Field Programmable Gate Array (FPGA) has attracted many researchers exploring more efficient deployment of object detection algorithms because of its high parallelism, good data locality, and reconfigurability. Current research approaches the problem from the perspectives of algorithm design and optimization, hardware architecture design, resource management and scheduling, system integration and optimization, and application scenario optimization. Against this background, this thesis studies deep-learning-based object detection algorithms and FPGA-based deep learning network accelerators, and proposes an energy-efficient object detection system based on an FPGA. The specific research covers the design of an object detection architecture that connects the image sensor directly to the FPGA, low-latency neural network model training, and a low-latency FPGA resource allocation strategy. The thesis implements a hardware prototype on a Xilinx FPGA platform and verifies its object detection performance and energy efficiency through experiments. The main work of this thesis is as follows:

(1) Design of the open-channel object detection architecture. This thesis proposes a new open-channel object detection architecture that connects the image sensor directly to the FPGA, avoiding the energy consumed by additional off-chip memory and the static power drawn by unused logic in the traditional architecture. The image acquisition module, image preprocessing module, and object detection module are all deployed on the FPGA, and a flexible, low-latency, high-bandwidth on-chip bus is designed to interconnect them. Compared with the traditional chain of image sensor, image processing unit, processor, and FPGA, the proposed architecture is more efficient and consumes less energy. The network structure of the object detection module is based on Ultra_net, with training and deployment optimized for an object detection system built on the PYNQ-Z2.

(2) Low-latency neural network model training. This thesis optimizes the real-time performance of the proposed open-channel object detection architecture. First, the image sensor feeds RAW images directly to the FPGA, which eliminates the latency and resource overhead of the demosaicing step that converts RAW data into RGB images. Second, to reduce the storage footprint and computational complexity of the model, the thesis compresses the model with low-bit quantization. To improve detection efficiency without reducing detection accuracy, the network is trained on Bayer-format images using quantization-aware training.

(3) Low-latency FPGA resource allocation strategy. To achieve the maximum overall throughput and the lowest latency for the open-channel object detection architecture, this thesis proposes a resource allocation strategy based on a model of parallel granularity and resource usage. The strategy builds a model relating the parallel granularity across channels and across convolution kernels to resource usage, and uses it to optimize the on-chip resources and latency of the FPGA. Subject to the resource constraints, the goal is to equalize the latency of every layer, which maximizes overall throughput and minimizes overall latency. The strategy improves the utilization of logic resources and gives better control over latency.

(4) Object detection system based on the PYNQ-Z2. To verify the proposed open-channel object detection architecture, the Pynq_net network is implemented on a Xilinx PYNQ-Z2 FPGA development board. A high-performance, low-power prototype system is realized through efficient cooperation between the ARM processor core on the PYNQ-Z2 platform and the functional units on the FPGA. The customized Pynq_net network reaches an IoU of 0.876, and the backbone network achieves a throughput of 58.5 GOPS at a clock frequency of 91 MHz. The power consumption of the whole system, from camera readout to the output of the object detection network, is 2.78 W; the detection speed is 45 FPS; and the energy consumption is 0.06 J per image.
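The idea behind item (3), equalizing per-layer latency under a resource budget, can be illustrated with a small sketch. This is not the thesis's actual model: the `Layer` class, the `balance` function, the workloads, and the budget of 64 MAC units are all hypothetical, and the cost model (one MAC unit per unit of parallelism) is a simplifying assumption. In a layer-pipelined accelerator the slowest layer bounds throughput, so the sketch repeatedly doubles the parallelism of the slowest layer until the budget is exhausted.

```python
from dataclasses import dataclass


@dataclass
class Layer:
    name: str
    macs: int              # multiply-accumulates per frame (made-up values)
    parallelism: int = 1   # MACs issued per cycle (assumed: 1 MAC unit each)

    @property
    def cycles(self) -> int:
        # Ceiling division: cycles needed to process one frame.
        return -(-self.macs // self.parallelism)


def balance(layers: list[Layer], mac_budget: int) -> list[Layer]:
    """Greedily double the slowest layer's parallelism while the budget allows."""
    while True:
        slowest = max(layers, key=lambda layer: layer.cycles)
        used = sum(layer.parallelism for layer in layers)
        extra = slowest.parallelism  # doubling adds this many MAC units
        if used + extra > mac_budget:
            return layers  # the bottleneck can no longer be widened
        slowest.parallelism *= 2


# Hypothetical three-layer pipeline and MAC budget.
layers = balance(
    [
        Layer("conv1", 8_000_000),
        Layer("conv2", 4_000_000),
        Layer("conv3", 1_000_000),
    ],
    mac_budget=64,
)

for layer in layers:
    print(layer.name, layer.parallelism, layer.cycles)
```

With these made-up workloads, every layer ends at the same cycle count, which is the "consistent delay of each layer" condition the abstract names. A real allocation would also need to model how channel-level and kernel-level unrolling map to DSP, LUT, and BRAM usage, which this sketch leaves out.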
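The reported figures can be cross-checked with simple arithmetic. The sketch below uses only the numbers quoted in the abstract; the function names are illustrative, and the operations-per-cycle figure assumes that 58.5 GOPS is a sustained average at 91 MHz.

```python
# Figures quoted in the abstract.
POWER_W = 2.78          # whole system, camera readout to network output
FRAME_RATE_FPS = 45     # detection speed
THROUGHPUT_GOPS = 58.5  # backbone network
CLOCK_MHZ = 91          # clock frequency


def energy_per_frame(power_w: float, fps: float) -> float:
    """Energy per image in joules: watts divided by frames per second."""
    return power_w / fps


def ops_per_cycle(gops: float, clock_mhz: float) -> float:
    """Average operations completed per clock cycle."""
    return (gops * 1e9) / (clock_mhz * 1e6)


energy_j = energy_per_frame(POWER_W, FRAME_RATE_FPS)
avg_ops = ops_per_cycle(THROUGHPUT_GOPS, CLOCK_MHZ)

print(f"{energy_j:.3f} J per image")   # 0.062 J per image
print(f"{avg_ops:.0f} ops per cycle")  # 643 ops per cycle
```

The result, about 0.062 J per image, agrees with the reported 0.06 J/Pic.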

Keywords/Search Tags:

Object Detection, Hardware Accelerator, Field Programmable Gate Array, Raw Data, Open-Channel Object Detection Architecture

Related items

1	Implementation Of Moving Object Detection Base On FPGA
2	Research And Implementation Of Object Detection Acceleration Method Based On FPGA
3	Fast Drivable Area Detection Method Based On Heterogeneous Computing
4	The Design And Implementation Of Object Detection Chip Based On Deep Learning Algorithms
5	Research And FPGA Realization Of Real-time Motion Video Sequences To Achieve Target Detection Method
6	Design And Optimization Of Tiny YOLO Convolutional Neural Network Accelerator
7	SRAM Field Programmable Gate Array Design And Test Analysis
8	Designing And Implementation Of High Performance Architecture For Hardware Accelerated CLAMAV Software
9	The Software Hardware Co-Accelerating Of GCN Action Recognition Model Based On FPGA
10	Mixed Hardware/Software Accelerator-Centric Heterogeneous Architectures