| As a basic matrix operation, matrix multiplication is widely used in implementations of Artificial Neural Networks (ANNs). Because it affects the efficiency of ANN implementations, the design of matrix multiplication on Field Programmable Gate Array (FPGA) platforms has been widely examined, as FPGAs are one of the common ways to implement ANNs. However, performance evaluations of matrix multiplication are currently lacking, as most previous works focus on hardware design only. By comparing three proposed designs, this thesis discusses suitable hardware implementations for various Deep Neural Network (DNN) scenarios, based on a performance evaluation of the matrix multiplication implementations. As the basis of Forward Propagation (FP) in DNNs, two hardware designs of a single layer are proposed, based on the Multiplier Adder (MAD) and the Multiply Accumulator (MAC), respectively. The performance of the two proposed single-layer architectures is compared by investigating their timing analysis and their consumption of resources such as Digital Signal Processors (DSPs), Look-Up Tables (LUTs) and Registers (REGs). A multi-layer architecture for DNN computation can be obtained by cascading multiple single-layer architectures. When implementing the same multi-layer DNN structure, the MAC-based multi-layer architecture outperforms the MAD-based one in terms of timing and resource consumption. As the relationship between the resource consumption of multi-layer and single-layer architectures is not linear, the resource consumption of a multi-layer DNN is estimated with a parameterized method obtained by analyzing the resource consumption of DNN implementations of different sizes. A compound hardware design combining the MAD and MAC architectures is proposed for DNNs with an odd number of hidden layers. Using the handwritten digit database (known as MNIST) and a physical implementation on the Xilinx VC707 development board, the compound design is confirmed to be effective and accurate, with 13% savings in resource consumption in terms of slices. |