Compared with traditional algorithms, neural network-based methods usually achieve much better performance in their respective domains and have been applied in many fields, such as speech recognition, object detection, and object segmentation. However, the computational cost of these methods is usually very high, which limits the use of neural networks in embedded scenarios such as VR/AR, mobile phones, smart security, and autonomous driving. To address this problem, this thesis explores an engineering-feasible, embedded-platform-oriented deployment scheme for convolutional networks and verifies it on an FPGA, providing system-level support for building an FPGA-based convolutional network accelerator.

Taking the YOLO algorithm as a representative, this thesis summarizes the computational characteristics of convolutional networks and demonstrates that the convolution computation can be parallelized on an FPGA. Since FPGA computation relies on fixed-point arithmetic, which conflicts with the floating-point operations of existing networks, an integer quantization method for convolutional networks is improved. The method uses statistical extrema to dynamically quantize the inputs, outputs, and weights of the network, ensuring that the forward pass uses integer arithmetic only and resolving the numerical conflict on the FPGA. Notably, the improved quantization method simplifies convolutional network inference while keeping the accuracy loss below 2%.

A hardware acceleration architecture for the YOLO algorithm is then designed based on HLS. The system consists of two parts: parallel inference of the convolutional network in programmable logic, and data scheduling on the ARM processor. A general platform based on the Xilinx ZC706 evaluation board was designed for algorithm testing and verification of the YOLO acceleration architecture. The hardware acceleration architecture designed in this thesis runs 19 times faster than the CPU.

Finally, the performance of the improved network quantization method is analyzed, and the network accuracy before and after quantization is compared. The resource utilization of the hardware implementation is reported, and the performance of the YOLO algorithm implemented on the FPGA is compared and analyzed. The thesis concludes by pointing out its limitations and directions for further research.
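To make the quantization idea above concrete, the following is a minimal Python sketch of extremum-based symmetric int8 quantization with integer-only accumulation. The function names, 8-bit width, and per-tensor scaling are illustrative assumptions, not the thesis's exact implementation.

import numpy as np

def quantize_tensor(x, num_bits=8):
    # Derive a symmetric scale from the observed extremum (max absolute value),
    # then map floating-point values to signed integers.
    qmax = 2 ** (num_bits - 1) - 1          # 127 for int8
    scale = np.max(np.abs(x)) / qmax        # statistical extremum -> scale
    q = np.clip(np.round(x / scale), -qmax, qmax).astype(np.int8)
    return q, scale

def int_matmul(q_in, q_w, s_in, s_w):
    # Integer-only accumulation; the combined scale is applied once afterwards
    # (or folded into the next layer's input scale on the hardware side).
    acc = np.matmul(q_in.astype(np.int32), q_w.astype(np.int32))
    return acc, s_in * s_w

# Usage: quantize a random activation and weight, run an integer product,
# then dequantize once to compare against the floating-point result.
x = np.random.randn(1, 16).astype(np.float32)
w = np.random.randn(16, 8).astype(np.float32)
q_x, s_x = quantize_tensor(x)
q_w, s_w = quantize_tensor(w)
acc, s_out = int_matmul(q_x, q_w, s_x, s_w)
y_approx = acc * s_out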