With the continuous improvement of chip manufacturing processes, more and more DSP slices and on-chip memory resources are being integrated into FPGAs, which gives FPGAs great advantages in compute-intensive hardware acceleration. As a typical compute-intensive application, convolutional neural networks have important guiding significance and application value in fields such as face recognition and image segmentation, and have attracted wide attention from both academia and industry. However, implementations of convolutional neural networks on general-purpose processors cannot fully exploit the parallelism within the network model. With the growing demand for real-time, low-power applications, more and more researchers have begun to use FPGAs to develop applications based on convolutional neural networks. As a network model of historical importance in the field of convolutional neural networks, AlexNet not only demonstrated the effectiveness of convolutional neural networks on complex models, but also used GPUs to bring the training of large datasets within an acceptable time frame. Studying the acceleration of the AlexNet model is therefore of great significance for accelerating convolutional neural networks with complex model structures. Against this background, and on the basis of a thorough survey and analysis of current research, this thesis designs and implements an FPGA-based accelerator for the AlexNet forward network, improving the overall recognition speed by optimizing the model structure, pipelining the layer processing, and increasing the parallelism of the network.

The main research work of this thesis includes the following.

1. The main factors affecting the performance of the AlexNet forward recognition network are studied, and the activation function and pooling module of the network model are optimized and improved. The parallel computation within the network model is first examined; the forward computation process and the multiplication workload are then analyzed; finally, the computations of the activation function and the pooling module are analyzed, and the network model is optimized by combining the two, so that the maximum value of the pooling window is computed before the activation is applied, in a form well suited to the FPGA. With the output guaranteed to remain unchanged, 397,428 comparison operations are eliminated, accounting for 76.4% of the comparisons required by the activation function and pooling module before the optimization. A sketch of this fusion is given below.
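The reduction rests on the fact that ReLU is monotone, so taking the maximum of a pooling window and applying ReLU once gives the same result as applying ReLU to every element first. The following minimal software sketch contrasts the two orderings; the function names are illustrative only, and the thesis realizes the fused form in FPGA hardware rather than in C.

```c
#include <stddef.h>

static float relu(float x) { return x > 0.0f ? x : 0.0f; }

/* Naive ordering: ReLU on every element, then the window maximum.
 * Cost per window of n elements: n comparisons with zero plus
 * (n - 1) pooling comparisons. */
float relu_then_max_pool(const float *window, size_t n) {
    float best = relu(window[0]);
    for (size_t i = 1; i < n; ++i) {
        float v = relu(window[i]);   /* one comparison with zero per element */
        if (v > best) best = v;      /* pooling comparison */
    }
    return best;
}

/* Fused ordering after the optimization: because ReLU is monotone,
 * max_i ReLU(x_i) == ReLU(max_i x_i), so the window maximum is taken
 * first and ReLU is applied only once per window.
 * Cost per window: (n - 1) pooling comparisons plus one comparison with zero. */
float max_pool_then_relu(const float *window, size_t n) {
    float best = window[0];
    for (size_t i = 1; i < n; ++i) {
        if (window[i] > best) best = window[i];
    }
    return relu(best);
}
```

Because the output is unchanged while per-element comparisons with zero collapse into a single comparison per window, the fused ordering is what allows the comparison count to drop without affecting the recognition result.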
2. The optimized design of the key modules of the FPGA-based AlexNet forward network is completed. The basic computational unit of the convolutional neural network and its two-dimensional parallel acceleration are designed, and the speedup ratio of the two-dimensional parallel acceleration is analyzed. Then, according to the amount of data output by each layer, the on-chip BRAM resources are allocated appropriately; this determines the degree of parallelism of each layer, and the structural design of every layer of the network is completed. An illustrative sketch of the two-dimensional parallel computation pattern is given after item 3 below.

3. The FPGA-based implementation and performance analysis of the AlexNet forward network are completed. On the FPGA development platform, the overall framework of the AlexNet forward network is designed and implemented, the design is verified through simulation, and the resource usage and design performance are analyzed. The computation time of each layer of the forward recognition network is reported, the resource usage of the accelerator design is listed, the recognition speed is compared with that of a GPU and a CPU, and the correctness of the recognition results is verified.
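For the two-dimensional parallel acceleration mentioned in item 2, the sketch below assumes, purely for illustration, that output channels and kernel positions are the two unrolled dimensions; the parallelism factors PO and PK and all names are hypothetical, since the thesis derives the per-layer factors from the BRAM budget and each layer's output data volume.

```c
#include <stddef.h>

/* Parallelism factors: assumptions chosen only for illustration.
 * In the thesis the per-layer factors are derived from the on-chip
 * BRAM budget and the output data volume of each layer. */
#define PO 4   /* output channels processed in parallel (dimension 1) */
#define PK 9   /* kernel positions processed in parallel, e.g. a 3x3 window (dimension 2) */

/* One output point of a convolution layer.  In hardware, the two loops
 * below would be fully unrolled into PO * PK multiply-accumulate units
 * working every cycle, giving an ideal speedup of about PO * PK over a
 * single sequential MAC unit. */
void conv_point_2d_parallel(const float in[PK],
                            const float weight[PO][PK],
                            const float bias[PO],
                            float out[PO]) {
    for (size_t o = 0; o < PO; ++o) {        /* parallel dimension 1 */
        float acc = bias[o];
        for (size_t k = 0; k < PK; ++k) {    /* parallel dimension 2 */
            acc += weight[o][k] * in[k];
        }
        out[o] = acc;
    }
}
```

Under these assumptions the ideal speedup ratio of the two-dimensional parallel unit over a fully sequential multiply-accumulate loop is roughly PO x PK, which is the kind of ratio the thesis analyzes when fixing each layer's parallelism from the available BRAM.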