| Owing to their high accuracy, deep neural networks have attracted much attention in the field of intelligent computing. However, this accuracy comes at the cost of a huge number of parameters and a heavy computational load, which hinders the deployment of large-scale neural networks on intelligent hardware platforms with limited storage, energy, and computing power. In theory, pruning can greatly reduce the data volume and computation of a deep neural network. However, the irregular data layout of sparse neural networks poses two challenges for existing hardware platforms. First, data access is inefficient. Second, decoding sparse neural networks is relatively complex, and the decoding process adds many extra calculations. Meanwhile, many kinds of neural network algorithms already exist, and new ones are constantly being proposed. A domain-specific SoC built from a CPU plus an FPGA accelerator offers both flexibility and high performance, which makes it a good choice for running neural network algorithms. Rocket Chip, an open-source SoC generator based on the RISC-V instruction set, supports many forms of SoC and provides a good platform for research. On this basis, this paper designs and implements a RISC-V domain-specific SoC for sparse neural network algorithms. To make sparse neural network algorithms run efficiently on the SoC platform, the paper studies weight storage methods for sparse neural networks and the design of a sparse neural network accelerator. To address the low data access efficiency of sparse neural networks, the paper proposes a dynamic ELL coding method to compress and store sparse weights and, based on the algorithmic characteristics of neural networks, a mixed weight storage strategy that combines weights compressed with dynamic ELL coding and weights stored directly. When a CNN using this storage strategy runs on a RISC-V SoC equipped with a general-purpose neural network coprocessor, overall system performance improves significantly compared with running the CNN with directly stored weights on the target platform. Moreover, the sparser the CNN, the greater the performance gain from mixed weight storage. To address the high decoding complexity of sparse neural networks, the paper designs a sparse vector inner product coprocessor that can screen for effective neurons. In the RISC-V SoC, the CPU calls this coprocessor to compute vector inner products in sparse fully connected layers. Compared with having the CPU call a general vector inner product coprocessor, the sparse vector inner product coprocessor delivers a clear speedup. Moreover, the larger the fully connected layers in the sparse neural network and the higher the sparsity, the more the sparse vector inner product coprocessor improves overall system performance. Overall, this research resolves, to a certain extent, the problems that sparse neural networks face when executed on hardware platforms. The sparse neural network deployment methods presented in this work help deploy larger-scale neural network algorithms on hardware platforms. |
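
The abstract names ELL coding and a sparse vector inner product but does not spell out either, so the following is only an illustrative sketch. It assumes the standard ELL (ELLPACK) layout, in which every row is padded to the width of the densest row and stored as parallel value and column-index arrays. It does not reproduce the paper's dynamic ELL variant or its coprocessor design. The names `ell_encode`, `ell_row_dot`, `ell_matvec`, and `PAD` are hypothetical.

```python
from typing import List, Tuple

PAD = -1  # hypothetical column index marking a padded (unused) slot


def ell_encode(dense: List[List[float]]) -> Tuple[List[List[float]], List[List[int]]]:
    """Encode a dense weight matrix in plain ELL format.

    Every row is padded to the nonzero count of the densest row, so the
    value and column-index arrays are rectangular and can be read with a
    regular stride.
    """
    width = max((sum(1 for w in row if w != 0) for row in dense), default=0)
    values: List[List[float]] = []
    cols: List[List[int]] = []
    for row in dense:
        nonzeros = [(w, j) for j, w in enumerate(row) if w != 0]
        padding = width - len(nonzeros)
        values.append([w for w, _ in nonzeros] + [0.0] * padding)
        cols.append([j for _, j in nonzeros] + [PAD] * padding)
    return values, cols


def ell_row_dot(values_row: List[float], cols_row: List[int], x: List[float]) -> float:
    """Inner product of one ELL-encoded weight row with a dense input vector.

    Only inputs whose column index is stored are read, which is one way to
    picture selecting the neurons that actually contribute to the sum.
    """
    acc = 0.0
    for w, j in zip(values_row, cols_row):
        if j == PAD:  # padding starts here; the rest of the row is empty
            break
        acc += w * x[j]
    return acc


def ell_matvec(values: List[List[float]], cols: List[List[int]], x: List[float]) -> List[float]:
    """Fully connected layer output (without bias) from ELL-encoded weights."""
    return [ell_row_dot(v, c, x) for v, c in zip(values, cols)]
```

In this layout, a layer with few nonzero weights per row stores far fewer entries than the dense matrix, while a nearly dense layer gains little once the index array is counted. This is one plausible motivation for mixing compressed and directly stored weights.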