Research On The Mechanism Of Hypersonic Flow Based On GPU

Posted on:2013-09-09

Degree:Master

Type:Thesis

Country:China

Candidate:H H Gao

Full Text:PDF

GTID:2180330422474229

Subject:Mechanics

Abstract/Summary:

In recent years, CFD technology has made rapid progress and its use has become increasingly widespread. Computational grid sizes have grown significantly, ranging from millions to billions. The resulting computational load puts enormous pressure on current CPU-based parallel computing clusters, and the GPU platform is a good solution to this problem. Based on CFD numerical simulation on the CUDA platform, this thesis analyzes the optimization characteristics of such simulations and the speedup ratios obtained on the CUDA platform for different numerical schemes.

The basic introduction to NVIDIA’s CUDA platform covers the stream processor array, the memory system, kernel functions, and the thread structure, and points out the differences from the CPU. The different parallel modes are listed, with emphasis on “single instruction, multiple threads” (SIMT), the parallel mode of the CUDA architecture, and its similarities to and differences from the other parallel modes are analyzed.

Some classic examples, such as the shock tube problem, moving shock/density wave interaction, and double Mach reflection, are computed on the CUDA architecture using the 5th-order WENO scheme. The implementation process is described, including the data structures, data communication, and the execution configuration of the kernel functions. For the one-dimensional problems, the accuracy of the CUDA-based parallel CFD algorithm is compared with that of the CPU implementation, with emphasis on the single-precision and double-precision CUDA algorithms. The speedup ratios of the single-precision and double-precision algorithms are also analyzed. For the two-dimensional problem, analysis of the flow field structure shows that the CUDA-based program is accurate, and that both the single-precision and double-precision CUDA algorithms depict the flow field well.

Performance optimization of the two-dimensional WENO5 program on CUDA revolves around three basic strategies: maximize parallel execution to achieve maximum utilization; optimize memory usage to achieve maximum memory throughput; and optimize instruction usage to achieve maximum instruction throughput. Streams and improved occupancy are used to maximize utilization. Coalesced access is used to maximize global memory throughput, and shared memory is used to maximize memory throughput. The use of low-throughput arithmetic instructions and of control flow instructions is minimized to achieve maximum instruction throughput.

Based on the two-dimensional double Mach reflection problem, different CFD numerical schemes are run on the CUDA architecture. Three schemes are used: the Steger-Warming splitting scheme, the Roe scheme, and the AUSMPW+ scheme. To analyze why the speedup ratio on CUDA differs from scheme to scheme, the three schemes are compared together with the 5th-order WENO scheme.
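
The abstract does not say which WENO variant the thesis implements. As a rough illustration only, the sketch below assumes the classic fifth-order weights of Jiang and Shu and reconstructs one left-biased interface value from a five-cell stencil. It is written in plain Python rather than CUDA so it can be run anywhere; the function name `weno5_left` and the value of `eps` are assumptions made for this example, not details taken from the thesis.

```python
def weno5_left(vm2, vm1, v0, vp1, vp2, eps=1e-6):
    """Left-biased WENO5 value at interface i+1/2 from cell averages
    v[i-2..i+2]. Assumes classic Jiang-Shu weights (an assumption here)."""
    # Third-order candidate reconstructions on the three sub-stencils
    p0 = (2.0 * vm2 - 7.0 * vm1 + 11.0 * v0) / 6.0
    p1 = (-vm1 + 5.0 * v0 + 2.0 * vp1) / 6.0
    p2 = (2.0 * v0 + 5.0 * vp1 - vp2) / 6.0

    # Smoothness indicators: large where a sub-stencil crosses a jump
    b0 = 13.0 / 12.0 * (vm2 - 2.0 * vm1 + v0) ** 2 \
        + 0.25 * (vm2 - 4.0 * vm1 + 3.0 * v0) ** 2
    b1 = 13.0 / 12.0 * (vm1 - 2.0 * v0 + vp1) ** 2 \
        + 0.25 * (vm1 - vp1) ** 2
    b2 = 13.0 / 12.0 * (v0 - 2.0 * vp1 + vp2) ** 2 \
        + 0.25 * (3.0 * v0 - 4.0 * vp1 + vp2) ** 2

    # Nonlinear weights built from the ideal linear weights 1/10, 6/10, 3/10
    a0 = 0.1 / (eps + b0) ** 2
    a1 = 0.6 / (eps + b1) ** 2
    a2 = 0.3 / (eps + b2) ** 2
    total = a0 + a1 + a2

    return (a0 * p0 + a1 * p1 + a2 * p2) / total


# Smooth (linear) data: every sub-stencil gives the same interface value.
smooth = weno5_left(1.0, 2.0, 3.0, 4.0, 5.0)

# Step data: the weights shift almost entirely to the stencil without the jump.
step = weno5_left(0.0, 0.0, 0.0, 1.0, 1.0)
```

In a GPU version, each thread would typically evaluate this stencil for one interface, which is why the memory access pattern of the five neighbouring cells matters for the coalescing and shared-memory strategies mentioned above.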

Keywords/Search Tags:

CFD, CUDA, Parallel Computation, 5th-order WENO scheme, Program optimization, Speedup ratio

Related items

1	A Quick Orthographic Rectification Approach For Optical Remote Sensing Satellite Imagery Based On CUDA Architecture
2	Parallel Time-Domain Method For EM Scattering From Rough Surface With/without A Target Based On CUDA
3	Three-Dimensional Computations On Capturing Of Gas-Water Interface By Level Set Method
4	Research On Numerical Methods Of Fluid Dynamics And Parallel Computational Algorithms
5	Fast Convexhull Computation Parallel Design And Implementation Based On CUDA
6	Parallel Implementation And Optimization Of EBE-FEM Based On CUDA Platform
7	Research On GPU Parallel Algorithms For Gas Hydrostatic Lubrication Flow Field Based On N-S Equation
8	A Hybrid WENO Scheme For Resolution Optimization And Its Applications
9	High density ratio multi-component lattice Boltzmann flow model for fluid dynamics and CUDA parallel computation
10	The Bubble Dynamics Study Based On CUDA And Lattice Boltzmann Method