In smart city traffic systems, vehicle flow is a fundamental piece of information. Detecting traffic flow accurately and quickly, and deploying that detection on an embedded hardware platform, is a crucial research topic in smart city traffic systems. In recent years, object detection algorithms based on convolutional neural networks (CNNs) have become dominant in many applications due to their superior accuracy compared to traditional schemes. The YOLO (You Only Look Once) series of algorithms has become popular for its combination of speed and precision. However, as CNN recognition accuracy continues to increase, so does computational complexity, which limits the deployment of these algorithms on hardware. Currently, CNNs are mostly deployed on Central Processing Units (CPUs) and Graphics Processing Units (GPUs). However, CPUs are composed mainly of memory units and control units and are therefore not well suited to accelerating deep learning models, while GPUs offer fast computation but high energy consumption, making them difficult to use in portable devices. Application-Specific Integrated Circuits (ASICs) can be designed to meet specific requirements, but they involve long design cycles, high initial investment, and a lack of reconfigurability. Field-Programmable Gate Arrays (FPGAs), on the other hand, combine the advantages of GPUs and ASICs: they are reconfigurable, high-performance, and low-power. In this paper, we use an FPGA to build a CNN hardware acceleration platform and accelerate and optimize the YOLOv2 (You Only Look Once Version 2) algorithm, achieving traffic flow detection on an embedded hardware platform. The main work of this paper is as follows:

(1) Optimizing the YOLOv2 network model. First, the YOLOv2 network model is quantized to half-precision floating-point format, reducing its storage requirements; after quantization, the model size is halved. Second, the convolutional layers and BN (Batch Normalization) layers are fused, improving the network's inference speed without reducing accuracy. Finally, the Confluence algorithm is used for post-processing, with a Manhattan-distance simplification to group all detection boxes belonging to the same target, reducing the complexity of the post-processing stage. Experimental comparison shows that the optimized YOLOv2 network model's inference time is reduced by 0.4 s.

(2) Designing a hardware accelerator that exploits the parallel computing capability of the FPGA. This paper adopts a top-down approach to modularize the network layers of YOLOv2, proposes basic half-precision floating-point operations from which convolutional layers, pooling layers, and the other network layers are constructed, and assembles the YOLOv2 network model from the bottom up. Because the on-chip memory of the FPGA cannot hold all of a layer's parameters and the logic resources are insufficient to perform all operations at once, this paper studies loop unrolling and loop tiling of the convolution based on the FPGA's hardware architecture, and proposes a pipelined operation based on ping-pong buffering that coordinates the input and output data-selection units by clock cycle to achieve high-speed data flow.

(3) Based on the XC7Z020 chip, a circuit board is designed for hardware acceleration of YOLOv2, building a vehicle flow detection system for detecting vehicle flow on the road. The system includes: a USB camera for capturing real-time road images; an OTG interface for powering the board; an SD card for storing the boot image and detection results; a network interface for remotely logging into the Linux system running on the PS (processing system) of the XC7Z020 chip and transmitting the camera images; and the XC7Z020 main control chip used for hardware acceleration of the YOLOv2 algorithm. The peripheral circuitry is built around this chip to complete the system design. Finally, comparative tests are performed on different devices. The experimental results show that the vehicle flow detection system proposed in this paper takes 2.01 seconds to detect a single frame, which is faster than the CPU's single-frame detection time and consumes far less power than the GPU, effectively achieving the system's functionality.
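The conv-BN fusion and half-precision quantization in item (1) can be sketched in a few lines: BN's per-channel scale and shift are folded into the convolution's weights and bias, so inference skips the BN layer entirely, and the fused weights are then cast to FP16 to halve storage. A minimal NumPy sketch (function names are illustrative, not from the thesis):

```python
import numpy as np

def fuse_conv_bn(W, b, gamma, beta, mean, var, eps=1e-5):
    """Fold a BatchNorm layer into the preceding convolution.

    W: conv weights, shape (out_ch, in_ch, kh, kw)
    b: conv bias, shape (out_ch,)
    gamma, beta, mean, var: BN parameters, shape (out_ch,)
    """
    scale = gamma / np.sqrt(var + eps)           # per-channel BN scale
    W_fused = W * scale[:, None, None, None]     # scale each output channel
    b_fused = (b - mean) * scale + beta          # fold BN shift into the bias
    return W_fused, b_fused

def quantize_fp16(W):
    """Quantize weights to half precision, halving FP32 storage."""
    return W.astype(np.float16)
```

Because conv followed by BN is an affine map of an affine map, the fused layer is mathematically identical to the original pair, which is why accuracy is preserved.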
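The Confluence post-processing in item (1) replaces IoU-based suppression with a Manhattan-distance proximity between box corners, grouping detections of the same target. The sketch below is a simplified interpretation under stated assumptions: the span normalization, the threshold value, and the greedy highest-confidence clustering are illustrative choices, not the thesis's exact formulation.

```python
import numpy as np

def manhattan_proximity(a, b):
    """Manhattan distance between the corner points of two boxes
    (x1, y1, x2, y2), normalized by the enclosing span so the measure
    is roughly scale-invariant (normalization scheme is an assumption)."""
    coords = np.concatenate([a, b]).reshape(2, 4)
    x_span = coords[:, [0, 2]].max() - coords[:, [0, 2]].min()
    y_span = coords[:, [1, 3]].max() - coords[:, [1, 3]].min()
    dist = (abs(a[0] - b[0]) + abs(a[1] - b[1]) +
            abs(a[2] - b[2]) + abs(a[3] - b[3]))
    return dist / (x_span + y_span + 1e-9)

def confluence_filter(boxes, scores, thresh=0.8):
    """Keep one box per cluster of mutually proximal detections."""
    order = np.argsort(scores)[::-1]        # highest confidence first
    keep, suppressed = [], set()
    for i in order:
        if i in suppressed:
            continue
        keep.append(int(i))
        for j in order:
            if j != i and j not in suppressed and \
               manhattan_proximity(boxes[i], boxes[j]) < thresh:
                suppressed.add(int(j))      # same cluster: suppress
    return keep
```

Unlike IoU, the Manhattan measure needs only additions and subtractions, which is what makes it attractive for reducing post-processing complexity on embedded hardware.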
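The loop tiling in item (2) can be illustrated in software: the outer loops step over channel tiles sized to fit the on-chip buffers, while the inner loops represent the work the unrolled hardware performs; in the ping-pong scheme, one buffer is refilled from off-chip memory while the compute units consume the other. A simplified Python sketch (tile sizes `Tm`, `Tn` and the buffering comments are assumptions about the described architecture, not the thesis's exact parameters):

```python
import numpy as np

def conv2d_tiled(ifmap, weights, Tm=2, Tn=2):
    """Loop-blocked 2-D convolution, mirroring an FPGA tiling scheme.

    ifmap:   (N, H, W)    input feature maps
    weights: (M, N, K, K) kernels
    Tm, Tn:  tile sizes for output/input channels; on the FPGA the
             inner loops over a tile would be unrolled in hardware.
    """
    N, H, W = ifmap.shape
    M, _, K, _ = weights.shape
    oh, ow = H - K + 1, W - K + 1
    out = np.zeros((M, oh, ow), dtype=ifmap.dtype)

    # Outer loops step over tiles; each tile of weights and inputs is
    # staged into one on-chip buffer (ping) while the other (pong) is
    # being consumed, hiding memory latency behind computation.
    for m0 in range(0, M, Tm):                     # output-channel tiles
        for n0 in range(0, N, Tn):                 # input-channel tiles
            w_tile = weights[m0:m0+Tm, n0:n0+Tn]   # "load" into buffer
            i_tile = ifmap[n0:n0+Tn]
            # Inner loops: the per-tile work of the unrolled hardware.
            for m in range(w_tile.shape[0]):
                for n in range(i_tile.shape[0]):
                    for kh in range(K):
                        for kw in range(K):
                            out[m0+m] += (w_tile[m, n, kh, kw] *
                                          i_tile[n, kh:kh+oh, kw:kw+ow])
    return out
```

Blocking changes only the iteration order, not the arithmetic, so the tiled result matches a direct convolution exactly; the tile sizes trade on-chip buffer usage against the number of off-chip transfers.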