
Design And Implementation Of Convolution Neural Network Acceleration Based On FPGA

Posted on: 2024-04-04
Degree: Master
Type: Thesis
Country: China
Candidate: C X Niu
GTID: 2568307088463554
Subject: Mechanical and electrical engineering

Abstract/Summary:
In recent years, thanks to the exponential growth of computing power, deep learning has been widely applied: technologies such as face recognition, speech recognition, and autonomous driving are changing the way people live. However, as convolutional neural networks develop, their computation and storage requirements keep increasing, and in low-power scenarios such as on-board AI computing and on-orbit processing of remote sensing images they are difficult to deploy on conventional hardware platforms. The common hardware acceleration platforms are GPUs, ASICs, and FPGAs. GPUs consume considerable power and ASICs are expensive and inflexible, whereas FPGAs offer low power consumption, strong parallel processing performance, and a reconfigurable fabric that can be adapted to different computing scenarios. FPGAs are therefore well suited to deploying convolutional neural networks in low-power environments such as on-board intelligent computing and on-orbit target recognition.

Because on-chip FPGA resources are limited, network parameters must be streamed in from off-chip memory, and exploiting the FPGA's highly parallel computation must be balanced against the need for efficient data movement. Designing a sound hardware architecture and data path is therefore essential for FPGA-based convolutional neural network acceleration. The main work of this thesis is as follows:

First, we analyze the characteristics of convolutional neural networks and study hardware acceleration schemes for the convolution layers, which account for the bulk of the computation. Based on the relationship between data bandwidth and computational parallelism, we design a pipelined computation array built on the Winograd algorithm: the feature data are tiled and rearranged, and the multi-stage parallelism of the pipeline is exploited to achieve higher DSP computation efficiency. The design is verified by simulation in Vivado.
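To illustrate the arithmetic saving that a Winograd-based array exploits, the following is a minimal NumPy sketch of the standard F(2x2, 3x3) Winograd convolution, not the thesis's actual RTL: it produces a 2x2 output patch from a 4x4 input tile using 16 element-wise multiplications instead of the 36 a direct 3x3 convolution would need, and checks the result against direct computation. The tile size, transform matrices, and function name are standard textbook choices, not details taken from the accelerator itself.

```python
import numpy as np

# Standard F(2x2, 3x3) Winograd transform matrices.
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=np.float64)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]], dtype=np.float64)
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=np.float64)

def winograd_f2x2_3x3(tile, kernel):
    """2x2 output patch from a 4x4 input tile and a 3x3 kernel,
    using 16 element-wise multiplications instead of 36."""
    U = G @ kernel @ G.T      # weight transform (can be precomputed per kernel)
    V = B_T @ tile @ B_T.T    # input-tile transform (additions/subtractions only)
    M = U * V                 # the 16 multiplications (mapped to DSPs in hardware)
    return A_T @ M @ A_T.T    # output transform back to the spatial domain

# Check against direct 3x3 convolution (CNN-style, no kernel flip).
rng = np.random.default_rng(0)
tile, kernel = rng.standard_normal((4, 4)), rng.standard_normal((3, 3))
direct = np.array([[np.sum(tile[i:i + 3, j:j + 3] * kernel) for j in range(2)]
                   for i in range(2)])
assert np.allclose(winograd_f2x2_3x3(tile, kernel), direct)
```

Because the weight transform can be precomputed offline and the input and output transforms use only additions and subtractions, the DSP multipliers are reserved for the element-wise product stage, which is the usual route to higher DSP efficiency in Winograd-based arrays.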
Second, building on the pipelined convolution array, we design a convolutional neural network accelerator. We study quantization strategies for convolutional neural networks and design a hardware-oriented 8-bit fixed-point quantization scheme to compress the network model (see the sketch following the abstract). The accelerator is partitioned into functional modules: a register control module that configures and controls the accelerator through input parameters; high-throughput feature-data and weight-data cache modules that raise data-transfer efficiency through serial-in/parallel-out buffering, overlapped data reuse, and ping-pong caching, achieving row and column reuse of feature data; an addition-tree-based channel-accumulation output cache module that streams results out seamlessly so that computation is never stalled by data output; and pooling and fully connected layer modules that adopt a layer-fusion design to reduce inter-layer time overhead.

Finally, the accelerator is deployed on a Xilinx ZCU104 development board and evaluated with a modified VGG-16 network on remote sensing image classification over the NWPU-RESISC45 dataset. The experimental results show that, at a clock frequency of 200 MHz, a single inference takes 124.5 ms, the convolutional-layer computing performance reaches 354.5 GOPS, the overall acceleration performance reaches 248.4 GOPS, and the computational efficiency reaches 0.48 GOPS/DSP, more than a 1.8x improvement over comparable designs. The accelerator thus completes VGG-16-based remote sensing image classification tasks with a high energy-efficiency ratio.
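As a rough illustration of the 8-bit fixed-point quantization mentioned above, here is a minimal NumPy sketch of symmetric fixed-point quantization with a per-tensor fractional bit width. The function names and the rule for choosing the fractional width are illustrative assumptions, not the scheme implemented in the thesis.

```python
import numpy as np

def choose_frac_bits(x, total_bits=8):
    """Pick a fractional bit width so the largest magnitude lands near the
    top of the signed 8-bit range (values beyond it saturate)."""
    max_abs = float(np.max(np.abs(x))) + 1e-12
    int_bits = max(0, int(np.ceil(np.log2(max_abs))))
    return total_bits - 1 - int_bits      # 1 sign bit + int_bits + frac bits

def quantize_int8(x, frac_bits):
    """Real value ~= q * 2**(-frac_bits), with q stored as signed int8."""
    q = np.round(x * 2.0 ** frac_bits)
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize(q, frac_bits):
    return q.astype(np.float32) * 2.0 ** (-frac_bits)

# Example: quantize a 64x3x3 weight tensor and check the worst-case error.
rng = np.random.default_rng(0)
w = (0.1 * rng.standard_normal((64, 3, 3))).astype(np.float32)
fb = choose_frac_bits(w)
w_hat = dequantize(quantize_int8(w, fb), fb)
print(f"frac_bits={fb}, max abs error={np.abs(w_hat - w).max():.6f}")
```

Storing weights and activations as 8-bit fixed-point values cuts memory traffic to a quarter of 32-bit floating point and maps directly onto the FPGA's integer DSP slices, which is why such schemes are commonly used to compress models for on-chip deployment.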
Keywords/Search Tags:Convolution neural network, FPGA, Winograd algorithm, pipeline, parallel computing