
Research On The Design Of Convolutional Neural Network Hardware Acceleration System In Deep Learning

Posted on: 2020-07-07
Degree: Master
Type: Thesis
Country: China
Candidate: K Wang
Full Text: PDF
GTID: 2438330596973156
Subject: Circuits and Systems
Abstract/Summary:
In recent years, the machine learning boom brought about by deep learning has led to deep neural networks being widely applied to large-scale problems such as image recognition, image classification, target detection and natural language processing, producing a series of breakthrough experimental results and practical applications. The powerful feature-learning, recognition and classification capabilities of deep learning have attracted extensive research attention. However, convolutional neural network (CNN) models are typically deep, hierarchically complex, large in scale, highly parallel, and both computation-intensive and storage-intensive. As a result, the large number of convolution and pooling operations becomes a major bottleneck in specific applications, and storing the many inter-layer results places high demands on the memory architecture of the computer, which makes real-time application scenarios very challenging.

As a device for accelerating highly intensive computation, the Field-Programmable Gate Array (FPGA) contains a large number of programmable logic, storage and computing resources that can fully exploit the parallelism of the CNN structure. An FPGA can run a CNN at high speed under the constraints of small size and low power consumption, which makes it an ideal platform for CNN computation. This thesis studies the design of a hardware acceleration system for the image recognition task in deep learning. Based on the structural characteristics of the CNN, the network is implemented in hardware ("hardened") on the FPGA of a ZYNQ-series chip. Parallel computation and pipelining reduce the computation time of the network and thereby achieve hardware acceleration. To meet the requirements of image recognition in real-time scenarios, this thesis also designs a real-time recognition hardware system framework. The framework uses hardware/software co-design: the ARM processor of the ZYNQ-series chip handles real-time acquisition, storage and display of the input image data and transfers the collected data over the AXI4 bus to the hardened CNN in the FPGA, which performs real-time recognition of the images. The framework allows the hardened CNN model to be replaced, so it can meet real-time recognition requirements in multiple scenarios.

The experimental results show that the hardened CNN model designed in this thesis can complete 528 convolution operations in a single clock cycle, a significant improvement over a general-purpose CPU. After 11-bit fixed-point quantization of the weight parameters, the recognition rate of the network is 97.14%, which is a high accuracy. The real-time recognition hardware system framework can recognize images captured by a camera in real time. Combined with the highly modular design of the ZYNQ device, the whole system framework is portable and provides the high performance and low power consumption required for overall system operation.
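The 11-bit fixed-point quantization of the weights mentioned above can be illustrated with a minimal sketch. The abstract does not state how the 11 bits are divided between sign, integer and fractional parts, nor the rounding or overflow behaviour, so the layout used here (1 sign bit and 10 fractional bits), round-to-nearest, saturation, and the function names `quantize_fixed_point` and `dequantize_fixed_point` are all assumptions made for illustration, not the thesis's actual scheme.

```python
def quantize_fixed_point(value, total_bits=11, frac_bits=10):
    """Map a real-valued weight to a signed fixed-point integer code.

    Assumed layout (not given in the abstract): total_bits wide,
    two's-complement, with frac_bits fractional bits.
    """
    scale = 1 << frac_bits                 # 2**frac_bits steps per unit
    lo = -(1 << (total_bits - 1))          # most negative representable code
    hi = (1 << (total_bits - 1)) - 1       # most positive representable code
    code = int(round(value * scale))       # round to the nearest step
    return max(lo, min(hi, code))          # saturate instead of wrapping


def dequantize_fixed_point(code, frac_bits=10):
    """Convert a fixed-point integer code back to a real value."""
    return code / (1 << frac_bits)


# Hypothetical example weights, chosen only to show rounding and saturation.
weights = [0.5, -0.25, 0.1234, 1.5, -2.0]
codes = [quantize_fixed_point(w) for w in weights]
restored = [dequantize_fixed_point(c) for c in codes]

for w, c, r in zip(weights, codes, restored):
    print(f"weight={w:+.4f}  code={c:+5d}  restored={r:+.6f}")
```

With this assumed layout, weights inside the representable range are reproduced to within half a quantization step (1/2048), while values outside it, such as 1.5 and -2.0 in the example, are clipped to the range limits. A real design would choose the integer/fraction split from the observed range of the trained weights.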
Keywords/Search Tags:Deep learning, CNN, hardware speedup, Real-time recognition