Performance Analysis And Optimization Of 3D-EW On Intel Knights Corner

Posted on: 2016-07-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y C Wang
Full Text: PDF
GTID: 2180330476453314
Subject: Computer Science and Technology
Abstract/Summary:
This thesis describes how I use the latest accelerator and co-processor technology, together with heterogeneous parallel computing, to improve the performance of 3D-EW, an elastic wave equation modelling code that addresses an important problem in geophysics. The 3D-EW (3D pure P- and S-wave elastic wave equation modelling) algorithm requires so many iterations that elastic wave equation modelling remains time-consuming even on today's computers. At the same time, 3D-EW is compute-bound and well suited to parallelization, and with the development of accelerators and co-processors, heterogeneous parallel computing has become a candidate method for improving its performance. The goal of this thesis is to improve the performance of the 3D-EW algorithm on the Intel Xeon Phi co-processor. Ported to Xeon Phi with OpenMP, 3D-EW achieves a 3.7x speedup over its performance on a Xeon processor. I then study the "Knights Corner" (KNC) architecture to understand its design features and potential bottlenecks, and use performance counters from profiling tools to examine low-level performance. During this work, data-level and instruction-level optimization methods for KNC are summarized with reference to optimization on CPUs. The algorithm is then significantly improved using OpenMP and C intrinsics to take advantage of vectorization and cache blocking, after which its performance on the Intel Xeon Phi 5110P co-processor reaches a 17.7x speedup compared with the Xeon processor. The results also show where the potential bottlenecks of KNC lie and which methods can relieve them, and they establish the relationship between performance counters and optimization on KNC.
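The abstract names vectorization and cache blocking with OpenMP as the main optimizations. The sketch below illustrates that combination on a generic 7-point Laplacian stencil; it is not the 3D-EW kernel, which this page does not show. The tile sizes `TILE_Y` and `TILE_Z`, all function names, and the use of `#pragma omp simd` in place of KNC-specific C intrinsics are assumptions made only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical tile sizes; real values would be tuned from cache-miss counters. */
#define TILE_Y 8
#define TILE_Z 8

/* Flattened index into an nx*ny*nz grid, x varying fastest. */
static inline size_t idx(size_t x, size_t y, size_t z, size_t nx, size_t ny)
{
    return (z * ny + y) * nx + x;
}

/* 7-point Laplacian at one interior point. */
static inline float lap7(const float *in, size_t x, size_t y, size_t z,
                         size_t nx, size_t ny)
{
    return in[idx(x - 1, y, z, nx, ny)] + in[idx(x + 1, y, z, nx, ny)]
         + in[idx(x, y - 1, z, nx, ny)] + in[idx(x, y + 1, z, nx, ny)]
         + in[idx(x, y, z - 1, nx, ny)] + in[idx(x, y, z + 1, nx, ny)]
         - 6.0f * in[idx(x, y, z, nx, ny)];
}

/* Reference version: plain triple loop over interior points, no blocking. */
void laplacian_naive(const float *in, float *out,
                     size_t nx, size_t ny, size_t nz)
{
    assert(nx >= 3 && ny >= 3 && nz >= 3);
    for (size_t z = 1; z < nz - 1; ++z)
        for (size_t y = 1; y < ny - 1; ++y)
            for (size_t x = 1; x < nx - 1; ++x)
                out[idx(x, y, z, nx, ny)] = lap7(in, x, y, z, nx, ny);
}

/* Blocked version: (y, z) tiles are shared among threads, the unit-stride
 * x loop is left whole so the compiler can vectorize it. */
void laplacian_blocked(const float *in, float *out,
                       size_t nx, size_t ny, size_t nz)
{
    assert(nx >= 3 && ny >= 3 && nz >= 3);
    #pragma omp parallel for collapse(2) schedule(static)
    for (size_t zz = 1; zz < nz - 1; zz += TILE_Z)
        for (size_t yy = 1; yy < ny - 1; yy += TILE_Y) {
            /* Clip each tile at the last interior plane or row. */
            size_t z_end = zz + TILE_Z < nz - 1 ? zz + TILE_Z : nz - 1;
            size_t y_end = yy + TILE_Y < ny - 1 ? yy + TILE_Y : ny - 1;
            for (size_t z = zz; z < z_end; ++z)
                for (size_t y = yy; y < y_end; ++y) {
                    #pragma omp simd
                    for (size_t x = 1; x < nx - 1; ++x)
                        out[idx(x, y, z, nx, ny)] = lap7(in, x, y, z, nx, ny);
                }
        }
}

/* Self-check: returns 1 if both versions agree on a test grid, else 0. */
int blocked_matches_naive(size_t nx, size_t ny, size_t nz)
{
    size_t n = nx * ny * nz;
    float *in = malloc(n * sizeof *in);
    float *a = calloc(n, sizeof *a);
    float *b = calloc(n, sizeof *b);
    int ok = (in && a && b);
    if (ok) {
        for (size_t z = 0; z < nz; ++z)
            for (size_t y = 0; y < ny; ++y)
                for (size_t x = 0; x < nx; ++x)
                    in[idx(x, y, z, nx, ny)] = (float)(x * x + y * y + z * z);
        laplacian_naive(in, a, nx, ny, nz);
        laplacian_blocked(in, b, nx, ny, nz);
        for (size_t i = 0; i < n; ++i)
            if (a[i] != b[i])
                ok = 0;
    }
    free(in);
    free(a);
    free(b);
    return ok;
}
```

Compiled without OpenMP support, the pragmas are ignored and the code runs serially, so `blocked_matches_naive` can be used to check a blocked kernel against the reference before profiling it. Tiling over y and z while keeping x contiguous is one common arrangement; the right tile shape for a given co-processor would have to be found by measurement.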
Keywords/Search Tags: Xeon Phi, High Performance Computing, Performance Optimization