
Temperature and Cooling Management in Computing Systems

Posted on: 2012-05-17
Degree: Ph.D
Type: Thesis
University: University of California, San Diego
Candidate: Ayoub, Raid
Full Text: PDF
GTID: 2452390011452962
Subject: Computer Engineering
Abstract/Summary:
Temperature and cooling are critical aspects of design in today's and future computing systems. High temperature has a significant impact on reliability, performance, leakage power and cooling energy costs. State-of-the-art temperature management techniques come with performance overhead and do not optimize for cooling energy costs. Energy management techniques usually focus on optimizing the computing energy without considering the impact on temperature or the cooling system. In general, managing temperature, cooling and energy separately leads to suboptimal solutions. In this thesis we introduce a new hierarchical approach that manages the temperature, cooling and energy problems jointly and with low overhead. Our approach addresses the microarchitecture, core, socket and system levels.

At the microarchitectural level we achieve temperature and energy optimizations by eliminating redundant writes to the register file at minimal performance overhead. The experimental results show that our technique achieves on average 22% energy savings in the register file with a 4°C reduction in temperature. We next introduce a novel core-level proactive thermal management technique that intelligently allocates jobs across the cores of a single CPU socket to create a better thermal balance across the chip. We introduce a novel temperature predictor based on the band-limited property of the temperature frequency spectrum, where the prediction coefficients can be identified accurately at design time. Our results show that applying our algorithm reduces the average system temperature by 6°C and the hottest core temperature by 8°C, and improves performance by 72%. At the CPU socket level, we propose a new algorithm that schedules the workload between sockets to minimize cooling energy by creating a better balance in temperature between the sockets. The reported results show that combining the socket-level and core-level optimizations can result in cooling energy savings of 80% on average at a performance overhead of less than 1%. Finally, we describe a combined temperature, cooling and energy management approach that significantly lowers the cooling energy costs of the system as well as the operational energy of memory. We introduce a comprehensive thermal and cooling model which is used for online decisions. This technique clusters the memory accesses to a subset of memory modules in tandem with balancing the temperature between and within the CPU sockets. The experimental results show that our method delivers average cooling and memory energy savings of up to 70% compared to state-of-the-art techniques at a performance overhead of less than 1%.
Keywords/Search Tags:Temperature, Cooling, System, Computing, Performance overhead, Management, Energy, Results show