| As demand grows for deploying artificial intelligence algorithms on edge devices, inference must be completed with low power consumption on embedded devices. The "memory wall" is a major bottleneck that is difficult to overcome in the von Neumann architecture. Compute-in-memory (CIM) is a promising technology for reducing the energy and latency of data movement. Digital and analog are the two main approaches to SRAM-based CIM. Digital CIM macros have several advantages over their analog counterparts, such as programmability and inference precision. However, previous digital designs generally employ complex SRAM bit-cells and computational components, which cause a large area overhead. This thesis proposes a new digital CIM macro and, based on this macro, a hybrid CIM system architecture that combines digital and analog approaches. First, we briefly introduce the research background of SRAM-based CIM and then survey and analyze existing works. We also introduce the basic theory and design techniques of SRAM and SRAM-based CIM. Then, we propose a new 8T SRAM bit-cell that reduces the overall area of the SRAM array and implements 1-bit multiplication without the read-disturb issue. In addition, we propose an interleaved adder tree and a dual-supply-voltage strategy, which significantly reduce the area and power consumption of the parallel adder circuits. We also propose a result recombination circuit, which improves the flexibility and accuracy of the CIM macro. A 16-Kb SRAM CIM macro with the proposed techniques is designed in a 40-nm CMOS technology. Simulation results show that our work achieves a throughput of 820 GOPS and an energy efficiency of 94 TOPS/W with 4-b inputs and weights. It achieves 1.3× higher energy efficiency and a 70% area reduction compared with recent state-of-the-art works. Finally, based on the digital CIM macro proposed in this thesis, we propose a hybrid CIM architecture and perform behavioral-level modeling. Functional simulation results show that the architecture works correctly in four modes to meet different application requirements. |
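
The digital data path summarized above (1-bit multiplication in the bit-cell, parallel adder tree, result recombination) can be illustrated with a minimal behavioral sketch. It assumes unsigned 4-b operands and a plain pairwise adder tree; it does not model the interleaved adder tree, the dual supply voltage strategy, or the actual recombination circuit of this thesis, and the function names `bit`, `adder_tree`, and `cim_mac` are hypothetical.

```python
def bit(value, k):
    """Return bit k of a non-negative integer."""
    return (value >> k) & 1


def adder_tree(values):
    """Sum a list pairwise, level by level, as a binary adder tree would."""
    level = list(values)
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd-sized levels with a zero operand
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0] if level else 0


def cim_mac(inputs, weights, bits=4):
    """Unsigned dot product built only from 1-bit AND products (assumed model)."""
    total = 0
    for i in range(bits):        # input bit plane
        for j in range(bits):    # weight bit plane
            # 1-bit multiplication is a logical AND of one input bit and one weight bit
            partial = adder_tree(
                [bit(x, i) & bit(w, j) for x, w in zip(inputs, weights)]
            )
            # recombination: weight each partial sum by its bit significance
            total += partial << (i + j)
    return total


if __name__ == "__main__":
    xs = [3, 5, 15, 0]
    ws = [7, 2, 15, 9]
    print(cim_mac(xs, ws), sum(x * w for x, w in zip(xs, ws)))
```

The sketch shows why the bit-cell only needs a 1-bit multiply: a multi-bit multiply-accumulate decomposes into AND products that are summed per bit plane and then shifted and accumulated.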