Research And Design Of Convolutional Neural Network Accelerator Based On Heterogeneous SOC

Posted on: 2024-03-03 | Degree: Master | Type: Thesis
Country: China | Candidate: Q Shen | Full Text: PDF
GTID: 2568307151466144 | Subject: Electronic information
Abstract/Summary:
In recent years, with the continuous development of deep learning, the accuracy of convolutional neural networks has improved further, which has prompted interest in running artificial intelligence algorithms on embedded terminals. Traditional network implementations are based on CPU and GPU architectures, which deliver high computing power but also consume a great deal of power. Custom chips achieve high computing performance at low power consumption, but they cannot be reconfigured as new networks are continually introduced, so iteration costs are high. The System on Chip (SOC) approach uses a CPU+DPU dual-core design to achieve high computing power and low power consumption while remaining functionally reconfigurable, which meets algorithm requirements in different scenarios. On this basis, this thesis designs a convolutional neural network accelerator based on a heterogeneous SOC.

First, the basic principles of the convolutional neural networks targeted by the accelerator are described, and the components of the network are explained module by module. Then the levels at which current research optimizes convolutional neural network accelerators are reviewed. At the hardware level, loop unrolling is chosen to adjust the degree of parallelism of the computation and to improve the arrangement of data in on-chip and off-chip storage. At the network level, post-training quantization and quantization during training are used to reduce the data bit width and compress the data volume. Finally, the front-end part of the SOC design flow is explained.

On this basis, the system architecture and the submodule circuits of the heterogeneous SOC accelerator are designed. The hardware description language Verilog is used to build the circuit of each module, including the on-chip AMBA bus module, the DMA data transfer module, and the operation unit module, and the modules are assembled into the overall architecture by instantiation. The system scheduling program that runs on the CPU is written in C++.

Finally, SystemVerilog-based simulation verification and FPGA prototype verification are carried out. The SystemVerilog language is introduced first, and a testbench is then built for each module under test to verify the timing and function of the key circuits. For data quantization, the Python-based PyTorch framework is used to train and quantize the network under test. The architecture is then ported to an FPGA for functional verification and performance analysis. Comparison with CPU implementations and with other designs of the same architecture type shows that the convolutional neural network accelerator designed in this thesis has certain advantages in computing speed and accuracy.
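To illustrate the post-training quantization mentioned in the abstract, here is a minimal Python sketch of symmetric per-tensor quantization to 8-bit integers. The abstract does not give the thesis's actual quantization scheme, bit width, or PyTorch workflow. The function names `quantize_symmetric` and `dequantize`, the 8-bit width, and the sample weights are all assumptions made for illustration.

```python
def quantize_symmetric(weights, num_bits=8):
    """Map float weights to signed integers using a single per-tensor scale.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    qmax = 2 ** (num_bits - 1) - 1              # 127 when num_bits is 8
    max_abs = max(abs(w) for w in weights)
    # Avoid division by zero when every weight is 0.
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer and clamp to the symmetric range.
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale


def dequantize(q, scale):
    """Recover approximate float values from the quantized integers."""
    return [v * scale for v in q]


# Sample weights are made up for this example.
weights = [0.5, -1.0, 0.25, 0.0]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
```

In this sketch the largest-magnitude weight maps to the edge of the integer range, and each restored value differs from its original by at most half of `scale`. That error is the accuracy cost paid for storing and moving narrower data.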
Keywords/Search Tags: Convolutional neural network, Convolutional neural network accelerator, Heterogeneous SOC, Floating-point quantization