| In recent years, the spread of smartphones, the growth of the Internet of Things (IoT), and the commercial rollout of 5G networks have placed increasingly strict requirements on the power consumption of integrated circuits, and low power has become steadily more important in IC design. As integrated circuits grow in scale, it becomes increasingly difficult for a single company to design every module in a chip, so most companies purchase some IP cores and integrate them into their chips. How to reduce post-implementation power consumption for a given netlist is therefore a problem every designer should consider. In this paper, the low-power design and physical implementation of a GPU module are carried out in the TSMC 12 nm process, and the weaknesses of the physical implementation flow are then optimized to further reduce the power consumption of the implemented GPU module. The main work of this paper is as follows: (1) The low-power design and physical implementation of the GPU module are completed using low-power techniques such as multiple voltage domains, clock gating, power switches, and multi-threshold-voltage devices. The shortcomings of the physical implementation flow are analyzed to provide direction for further optimization. (2) Macro placement is optimized. Macro placement is the basis of physical implementation: a good macro placement accelerates timing closure and reduces area and power after physical implementation. Drawing on previous GPU projects, this paper proposes improvements to macro placement and presents the improved placement results. (3) The clock tree synthesis (CTS) strategy is optimized. This paper uses the concurrent clock and data (CCD) technology from Synopsys. Compared with the traditional CTS strategy, CCD improves timing after CTS and optimization and reduces the number of timing violations, which in turn reduces the number of low-threshold and short-channel devices used to fix timing violations in later stages and thereby reduces power consumption. (4) The timing-violation fixing process is optimized. Scripts are written to fix certain special timing violations first at a small power cost, reducing the number of violations that remain to be fixed. A mix of ICE and PrimeTime is then used for timing repair, which further improves the repair results and limits the power increase incurred while fixing timing. (5) The GPU module is physically implemented again with the optimized flow, and the results are compared with the previous ones. The comparison shows that: 1) after timing and physical signoff, the proportion of low-Vt and short-channel devices in the sub-modules of the GPU module is reduced; 2) the design targets of a maximum static power of no more than 30 mW and a maximum dynamic power of no more than 600 mW are met; 3) with the optimized flow, the GPU module achieves better results: compared with the pre-optimization flow, static power is reduced by 14.72%, dynamic power by 4.93%, and total power by 5.4%, meeting the flow optimization goals; 4) the optimizations to the GPU module's physical implementation flow are reasonably general and can serve as a reference for the physical implementation of other modules. |
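
As a rough consistency check on the reported reductions, the sketch below treats the total-power reduction as a weighted average of the static and dynamic reductions and solves for the static share of pre-optimization power that the three percentages imply. Only the three percentages and the 30 mW / 600 mW design limits come from the abstract; the function name and the assumption that total power is simply static plus dynamic power are illustrative, and the 5.4% figure is rounded, so the result is approximate.

```python
def implied_static_share(static_red, dynamic_red, total_red):
    """Return the static fraction s of pre-optimization total power.

    Assumes total = static + dynamic, so that
    s * static_red + (1 - s) * dynamic_red = total_red.
    """
    if static_red == dynamic_red:
        raise ValueError("share is undetermined when both reductions are equal")
    return (total_red - dynamic_red) / (static_red - dynamic_red)


# Reductions reported in the abstract, in percent.
share = implied_static_share(14.72, 4.93, 5.4)

# Static share if both design limits (30 mW static, 600 mW dynamic) were exactly reached.
budget_share = 30 / (30 + 600)

print(f"implied static share: {share:.3f}")
print(f"static share at the design limits: {budget_share:.3f}")
```

Under these assumptions both shares come out close to 5%, so the three reported reductions are mutually consistent with a design whose power is dominated by dynamic consumption.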