Accelerating discontinuous Galerkin method and finite difference method by using multiple GPUs with CUDA

Posted on:2016-11-20

Degree:Ph.D

Type:Dissertation

University:University of Wyoming

Candidate:Mu, Dawei

GTID:1470390017477363

Subject:Geophysics

Abstract/Summary:

Accurate and efficient computer simulations of seismic wave propagation in realistic three-dimensional geological media are becoming increasingly important in seismology for improving our understanding of the earthquake rupture process that generates seismic waves and of the geological medium through which those waves propagate. However, the accurate and computationally efficient numerical solution of the three-dimensional (visco)elastic seismic wave equation is still a very challenging task, especially when the material properties are complex and the modeling geometry, such as surface topography and subsurface fault structures, is highly irregular.

We have successfully ported two different numerical methods for solving the three-dimensional elastic seismic wave equation from the CPU platform to the GPU platform. The first is the arbitrary high-order discontinuous Galerkin (ADER-DG) method, which was designed for solving the three-dimensional elastic seismic wave equation on unstructured tetrahedral meshes. This ADER-DG implementation obtained a speedup factor of about 24.3 for the single-precision version of our GPU code and about 12.8 for the double-precision version when compared with the serial CPU code running on one Intel Xeon W5880 core. By adding MPI-based parallelism and other optimizations, we further improved our ADER-DG code, which then obtained a speedup factor of about 28.3 for the single-precision version and about 14.9 for the double-precision version. To effectively overlap inter-process communication with computation, we separate the elements of each sub-domain into inner and outer elements, complete the computation on the outer elements, and fill the MPI buffer first. While the MPI messages travel across the network, the GPU performs computation on the inner elements and all other calculations that do not use information about outer elements from neighboring sub-domains. A significant portion of the speedup also comes from a customized matrix-matrix multiplication kernel, which is used extensively throughout our program. Preliminary performance analysis of our parallel GPU codes shows favorable strong and weak scalability.

The second numerical method we ported is the fourth-order finite-difference method. In this implementation, we used a staggered grid, a dual-layer mesh grid, the classical perfectly matched layer (PML), and many GPU optimization techniques to enhance the efficiency of our code. Relative to the serial double-precision CPU code running on one Intel Xeon W5880 core, our finite-difference implementation obtained a speedup factor of about 62 for the single-precision version of our GPU code and about 31 for the double-precision version.
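
For readers unfamiliar with the fourth-order staggered-grid scheme named in the abstract, the following is a minimal one-dimensional sketch in plain Python. It is not the dissertation's code: it assumes the standard textbook stencil coefficients 9/8 and -1/24, and the function names `staggered_derivative` and `max_error` are hypothetical, introduced only for this illustration.

```python
import math

# Standard fourth-order staggered-grid coefficients (assumed here; the
# abstract does not list the coefficients used in the dissertation).
C1 = 9.0 / 8.0
C2 = -1.0 / 24.0


def staggered_derivative(f, h):
    """Approximate df/dx at the half points x[i] + h/2.

    f holds samples at the integer grid points x[i] = i * h. The result
    has one entry for each i in 1 .. len(f) - 3, which are the positions
    where the four-point stencil fits inside the array.
    """
    return [
        (C1 * (f[i + 1] - f[i]) + C2 * (f[i + 2] - f[i - 1])) / h
        for i in range(1, len(f) - 2)
    ]


def max_error(n):
    """Largest error when differentiating sin(x) on [0, 1] with n cells."""
    h = 1.0 / n
    f = [math.sin(i * h) for i in range(n + 1)]
    d = staggered_derivative(f, h)
    # d[k] approximates cos(x) at the half point x = (k + 1.5) * h.
    return max(abs(v - math.cos((k + 1.5) * h)) for k, v in enumerate(d))
```

Because the stencil is fourth-order accurate, halving the grid spacing (for example, comparing `max_error(20)` with `max_error(40)`) should reduce the error by a factor of roughly 16. In a staggered-grid wave solver, velocities and stresses are stored at interleaved positions so that each derivative is evaluated at exactly the half point where it is needed.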

Keywords/Search Tags:

GPU, CPU code, Seismic wave, Method, Double-precision version, Speedup factor, Three-dimensional

Related items

1	The Research On Seismic Wave Propagation And Scattering Of Three-dimensional Poroelastic Sites With Single-porosity And Double-porosity Medium
2	Three-dimensional Numerical Simulation Study To The Seismic Advanced Prediction
3	Research On The High Precision Forward Of 3D Wave Equations
4	Common Focus Point (CFP) Migration And AVP Analysis Based On The Wave Theory
5	Slope Stability Analysis Of Silt Particle Flow Code Based On Strength Double Reduction Method
6	Three Questions About Quasi-cyclic Codes And Constacyclic Codes
7	The Research Of Precision Optimization Based On GPS Positioning With Code Pseudorange
8	Seismic Quality Factor Analysis Of Rock Mass Based On Discontinuous Numerical Method
9	Dimensional Seismic Data In Fault Identification Method
10	The Key Technology Research Of High-precision GNSS Base Line Processing