| This work focuses on the design of low-power, high-speed data and instruction caches for 32-bit embedded microprocessors. The author analyzes in detail the importance of cache design in low-power processors and its main design considerations, and shows that a low-power, high-efficiency cache design is both necessary and feasible. Data and instruction caches with low power consumption and high efficiency are then designed and verified. At the top level, the author proposes a two-phase tag comparison architecture, based on the observation that the instruction cache tolerates data delay better than the data cache. This architecture is shown to reduce the power consumption of the instruction cache and to raise its working frequency. A dynamic supply control scheme improves the data stability and writability of the SRAM cell. The circuit design consists of a digital design part and a full-custom design part. The digital interface design aims at a high hit rate and a low miss penalty. An FSM-based LFU replacement policy is adopted, which improves the hit rate substantially at reasonable cost. FB pre-fetch and two-level write buffer techniques are used to reduce the waiting time on a miss. FPGA verification is completed on the RTL model of the digital cache. The full-custom design part targets low power consumption and high speed on a hit, and combines the latest low-power SRAM circuit techniques with existing low-power design methods. The author adopts a sporadic pre-charge read strategy and a charge-recycling write technique to reduce the power of sequential read and sequential write operations, respectively. The pre-charge read strategy, which to the best of the author's knowledge is proposed here for the first time, substantially reduces the power of sequential read operations. The charge-recycling write technique, which as far as the author knows is applied to cache design for the first time, lowers the power of sequential write operations. Power supply control techniques such as column-based power supply, floating supply during write, and voltage bias also contribute to stable, low-power cache operation. As for speed, improvements are made to critical circuits such as the decoder, the driver, and the self-timing circuit, and a sense amplifier suited to this application is chosen. As a result, the critical path of the design is shortened, and it can work at a higher frequency than the existing design. Finally, the 8 KB four-way set-associative I-Cache and D-Cache modules are simulated and verified in SMIC 0.18 μm CMOS technology. The results are listed and compared with an existing set-associative cache. |
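
To make the replacement policy concrete, the following is a minimal behavioral sketch of LFU victim selection in an 8 KB four-way set-associative cache. It is an illustration under stated assumptions, not the FSM implementation described in the thesis: the 32-byte line size, the 2-bit saturating use counter, the tie-breaking rule (lowest-numbered way), and all names (`LfuCache`, `Line`, `access`) are hypothetical and do not come from the abstract.

```python
from dataclasses import dataclass, field

CACHE_BYTES = 8 * 1024  # from the abstract: 8 KB cache
WAYS = 4                # from the abstract: four-way set-associative
LINE_BYTES = 32         # assumption: the abstract does not give the line size
NUM_SETS = CACHE_BYTES // (WAYS * LINE_BYTES)
COUNTER_MAX = 3         # assumption: 2-bit saturating use counter per line


@dataclass
class Line:
    valid: bool = False
    tag: int = 0
    uses: int = 0


def _empty_sets():
    return [[Line() for _ in range(WAYS)] for _ in range(NUM_SETS)]


@dataclass
class LfuCache:
    sets: list = field(default_factory=_empty_sets)
    hits: int = 0
    misses: int = 0

    def access(self, addr: int) -> bool:
        """Return True on a hit; on a miss, fill a line and return False."""
        block = addr // LINE_BYTES
        index = block % NUM_SETS
        tag = block // NUM_SETS
        ways = self.sets[index]

        # Hit path: compare the tag against every valid way in the set.
        for line in ways:
            if line.valid and line.tag == tag:
                line.uses = min(line.uses + 1, COUNTER_MAX)
                self.hits += 1
                return True

        # Miss path: prefer an invalid way, else evict the least-used way.
        # min() returns the first minimum, so ties go to the lowest way.
        self.misses += 1
        victim = next((line for line in ways if not line.valid), None)
        if victim is None:
            victim = min(ways, key=lambda line: line.uses)
        victim.valid, victim.tag, victim.uses = True, tag, 1
        return False
```

Under these assumptions the cache has 64 sets, so addresses that differ by a multiple of 2048 bytes compete for the same set. A saturating counter keeps the per-line state small, at the cost of no longer distinguishing lines once they reach the maximum count.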