High Performance Computing (HPC) is widely used in science and engineering to solve large-scale computational problems. As HPC has developed, many high-performance computers have been built and put into use, and their peak performance has increased continuously and rapidly. However, the sustained performance achieved by real applications has not increased at the same rate, so the gap between applications' sustained performance and the machines' peak performance is widening. Two facts about the rapid progress of high performance computing deserve attention. First, the peak performance of parallel computers is advancing quickly and has reached the level of 100 Tflops, and clusters, with their high performance/cost ratio, have become the mainstream architecture and are adopted in more and more applications. Second, the sustained performance of parallel applications is lower than 20% of peak performance. Measurement and analysis of performance data based on hardware performance monitoring counters is now becoming a foundation of modern performance analysis. At the same time, to enable users to access these low-level hardware performance counters conveniently and safely, many application... |