Brain-inspired computing is one of the major challenges of this century in both academia and industry, and it has attracted much attention in fields such as computer science, neuroscience, and artificial intelligence. It is generally believed that brain-inspired computing has two important branches: the Deep Neural Network (DNN), known as the second-generation neural network, and the Spiking Neural Network (SNN), known as the third-generation neural network. Together they are driving the ongoing revolution in artificial intelligence. With the rise of intelligent applications such as image recognition, semantic analysis, and machine translation, traditional commercial hardware platforms have become increasingly inadequate. Researchers around the world have therefore launched a variety of "BRAIN Initiatives" that take on the challenge of "building brains". A brain-inspired computing system draws on the information-processing methods of the human brain and aims to realize a new computing system with some or all of the characteristics of the brain. To this end, this dissertation focuses on the key technologies of brain-inspired computing models and their hardware acceleration, exploring brain-inspired computing from the perspectives of both algorithms and hardware with the goal of improving information-processing capability. The main contents and results of this dissertation are summarized as follows.

(1) Through an analysis of the computational characteristics of quantized convolutional neural network (CNN) models and the features of parallel streaming architectures, we propose a quantized CNN accelerator based on a streaming architecture. Optimizations are carried out in several aspects, including dataflow, processing elements (PEs), memory access, and workload balancing, so as to fully exploit the computing power of the accelerator. The experimental results show that the accelerator reaches 163 frames/s when accelerating the AlexNet model, with an average PE utilization of 91.79%.

(2) To broaden the application scenarios of CNN accelerators and exploit the sparsity of CNN models, we propose an efficient sparse CNN accelerator that supports several kinds of convolutions. This work solves the mapping problem of multiple convolution types from the perspective of sparsity, which reduces control overhead. In addition, we propose an ineffectual data removing (IDR) mechanism that filters out both ineffectual activations and ineffectual weights, so that the sparsity translates into performance. Furthermore, a flexible layered load balance (LLB) mechanism is introduced to alleviate load imbalance and further improve utilization and throughput. The experimental results show that this work achieves the best energy efficiency (3.72 TOPS/W) among the compared processors when accelerating Deep Convolutional Generative Adversarial Networks (DCGANs).

(3) We demonstrate a converted SNN for image segmentation applied to a natural video dataset. Layer-specific normalization and interval reset with early stopping are applied to obtain low latency and high accuracy in the converted SNN. The experimental results show that this work achieves a 34.7x convergence speedup within an acceptable range of accuracy loss.
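As a rough illustration of the conversion strategy in (3), the sketch below normalizes each layer's weights by a percentile of its observed ANN activations (layer-specific normalization) and simulates integrate-and-fire neurons with a periodic membrane-potential reset plus early stopping once the output prediction stabilizes. All function names, the percentile choice, and the neuron model here are illustrative assumptions, not the dissertation's actual implementation.

import numpy as np

# --- Illustrative ANN-to-SNN conversion utilities (assumed, not the thesis code) ---

def layer_norm_scales(activations, percentile=99.9):
    """Per-layer scale factors from a percentile of observed ANN activations."""
    return [np.percentile(a, percentile) for a in activations]

def normalize_weights(weights, scales):
    """Layer-specific weight normalization so spike rates stay in a usable range."""
    normed, prev = [], 1.0
    for w, s in zip(weights, scales):
        normed.append(w * prev / s)   # rescale by previous/current layer scale
        prev = s
    return normed

def run_snn(weights, x, timesteps=256, reset_interval=64, patience=16, thresh=1.0):
    """Simulate IF neurons with interval reset and early stopping."""
    v = [np.zeros(w.shape[1]) for w in weights]       # membrane potentials per layer
    counts = np.zeros(weights[-1].shape[1])           # output spike counts
    last_pred, stable = None, 0
    for t in range(1, timesteps + 1):
        spikes = (x > np.random.rand(*x.shape)).astype(float)   # rate-coded input
        for i, w in enumerate(weights):
            v[i] += spikes @ w
            spikes = (v[i] >= thresh).astype(float)
            v[i] -= spikes * thresh                    # reset-by-subtraction
        counts += spikes
        if t % reset_interval == 0:                    # interval reset of potentials
            v = [np.zeros_like(m) for m in v]
        pred = int(np.argmax(counts))
        stable = stable + 1 if pred == last_pred else 0
        last_pred = pred
        if stable >= patience:                         # early stop: output is stable
            break
    return last_pred, t

In this toy setting, early stopping is what yields the latency reduction: simulation ends as soon as the running prediction has been unchanged for a fixed number of timesteps, rather than always running the full window.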
(4) Through an analysis of the dynamic sparsity and spatio-temporal characteristics of SNNs, we propose an efficient event-based SNN accelerator. A reconfigurable spiking neuron processing unit with flexible instructions is introduced to support a variety of spiking layers in SNNs. Furthermore, an efficient dataflow targeting dynamic sparsity with a fast-filtering mechanism is proposed to achieve higher PE utilization, lower power consumption, and lower latency. The experimental results show that this work achieves higher PE utilization (90.98%) and energy efficiency (5.04 TOPS/W) than the compared processors.
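As a loose illustration of the event-driven idea behind (4), the sketch below gathers only the indices of active (nonzero) input spikes before dispatching accumulation work, so silent neurons never occupy a processing element. The function names, the dense weight layout, and the toy usage are assumptions for exposition only, not the accelerator's microarchitecture.

import numpy as np

def fast_filter_events(spikes):
    """Keep only the indices of active (nonzero) input spikes; zero events are skipped."""
    return np.flatnonzero(spikes)

def event_driven_layer(spikes, weights, v_mem, thresh=1.0):
    """Accumulate weight rows only for active events, mimicking a sparsity-aware dataflow."""
    for idx in fast_filter_events(spikes):
        v_mem += weights[idx]            # one PE-style accumulation per active event
    out = (v_mem >= thresh).astype(float)
    v_mem -= out * thresh                # reset-by-subtraction
    return out, v_mem

# Toy usage: a mostly silent input layer only triggers work for its few active events.
rng = np.random.default_rng(0)
spikes = (rng.random(1024) < 0.05).astype(float)    # ~5% of input neurons fire this timestep
weights = rng.standard_normal((1024, 256)) * 0.1
out, v = event_driven_layer(spikes, weights, np.zeros(256))
print(f"active inputs: {int(spikes.sum())}, output spikes: {int(out.sum())}")

The point of the sketch is that the amount of work scales with the number of spike events rather than with the layer size, which is the property an event-based accelerator exploits to keep PE utilization high and latency low.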