
The Evaluation Of Portable Performance For OpenACC 2.0

Posted on: 2016-06-25
Degree: Master
Type: Thesis
Country: China
Candidate: Suttinee Sawadsitang
GTID: 2308330476453352
Subject: Computer Science and Technology
Abstract/Summary:
In previous generations, computer architecture relied on increasing clock frequency to improve code performance. Programmers were able to ride this trend and did not have to make significant code changes. Once this approach hit the power wall, the trend shifted toward increased use of accelerators. Nowadays, accelerators play a major role in high-performance computing. A convenient way to take advantage of accelerators is to use a high-level programming model, such as OpenACC. An OpenACC compiler lets a programmer run the same code across multiple platforms, an approach also known as "write once, run anywhere".

Top supercomputers, such as Tianhe-2, Titan, and TSUBAME 2.5, rely heavily on accelerators. However, application developers need to write different versions of their code, such as CUDA and OpenMP versions, for these supercomputers. OpenACC allows programmers to write a single source code and execute it across these supercomputers, while remaining a directive-based approach, unlike low-level approaches such as OpenCL.

We studied the performance portability of OpenACC on an NVIDIA Kepler GPU and an Intel Knights Corner coprocessor by evaluating four kernel benchmarks from the Rodinia benchmark suite and one mini-application called Hydro, using the CAPS and PGI compilers. We also analyzed the Parallel Thread Execution (PTX) code to explain the results. The results show that the traditional programming approach can narrow the performance gap between OpenACC and OpenCL to less than 75% on both accelerators with the PGI compiler and 53% with the CAPS compiler. Moreover, the results show that OpenACC can achieve even better performance portability ratios in some cases. If "Ninja" (ultimate) performance is not the goal, this work confirms that low-level programming models are not always necessary.
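To illustrate what "directive-based" means in practice, the following is a minimal sketch in C and is not code from the thesis. The function name `saxpy` and the choice of data clauses are assumptions made for illustration. The loop body is ordinary C; the single `#pragma acc` line asks an OpenACC compiler to offload the loop, and a compiler without OpenACC support ignores the pragma and runs the loop serially.

```c
#include <stddef.h>

/* Hypothetical example kernel (not from the thesis): y = a * x + y.
 * The pragma below is the only accelerator-specific line. */
void saxpy(int n, float a, const float *x, float *y)
{
    /* copyin: x is only read on the device; copy: y is read and written back. */
    #pragma acc parallel loop copyin(x[0:n]) copy(y[0:n])
    for (int i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }
}
```

An equivalent OpenCL version would additionally need explicit host code for device selection, buffer management, and kernel launch, which is the kind of low-level effort the abstract contrasts with the directive-based approach.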
Keywords/Search Tags: OpenACC, OpenCL, Performance Portability, GPU, Xeon Phi