Research On DSL-based Multi-platform Parallelization And Optimization Techniques For CFD Applications

Posted on: 2023-02-22
Degree: Master
Type: Thesis
Country: China
Candidate: F T Chen
GTID: 2568307169478404
Subject: Software engineering

Abstract/Summary:

Parallel programming of application software is an essential step toward high-performance computing. Current high-performance computing architectures are becoming increasingly heterogeneous and diverse, which poses great challenges to the development, portability, and maintenance of parallel application software. Domain-specific languages (DSLs) offer a feasible path to portable parallel programming: by giving up generality and focusing on a narrow problem domain, they can achieve high performance and portability at the same time. DSL-based programming frameworks and the parallel programming techniques built on them have therefore received wide attention in the high-performance computing community. OPS (Oxford Parallel library for Structured mesh Solvers) is a representative DSL-based programming framework that supports portable parallel programming of structured-mesh applications. Based on the OPS framework, this thesis studies multi-platform parallelization and performance optimization techniques for CFD applications. The main contributions of the thesis are as follows:

First, the multi-platform parallel programming capability of OPS is evaluated and analyzed. The performance of the MPI, MPI+OpenMP, and CUDA parallel programs generated by OPS for representative structured-mesh applications is measured and compared with manually parallelized and optimized counterparts, and the main factors influencing OPS performance are identified.

Second, an OPS-based multi-platform parallelization of the lattice Boltzmann method is designed and implemented. The algorithmic characteristics of the lattice Boltzmann method are analyzed, and automatic multi-platform parallelization is achieved following the OPS development workflow. Its performance is compared with a manual OpenMP parallelization and found to be lower; the analysis shows that a large part of the performance gap is attributable to the memory access behavior of OPS.

Third, two memory access optimization techniques are proposed for the application-level parallel code generated by OPS. Memory traffic is significantly reduced by shrinking the size of OPS datasets. The memory access strategy of OPS is analyzed, and, to eliminate the adverse impact of loop fission on memory access performance, a loop fusion optimization strategy based on OPS is implemented. Experimental results show that the optimization strategies effectively improve memory access performance.

Keywords/Search Tags: Domain Specific Language, CFD applications, OpenMP, MPI, CUDA, performance evaluation, performance optimization
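To illustrate the programming model the abstract refers to, the following is a minimal sketch of an OPS-style 2D structured-mesh loop, written against the publicly documented OPS C++ API. It is not code from the thesis; the header name, the `ACC<double>` kernel-argument style, and details such as passing a null data pointer to `ops_decl_dat` are assumptions that may differ between OPS releases. The point is that a single `ops_par_loop` call is what OPS translates into MPI, MPI+OpenMP, or CUDA code.

```cpp
// Minimal 2D OPS sketch: copy one field into another on a structured block.
// Header and kernel-argument conventions vary between OPS releases; this
// follows the newer ops_seq_v2.h / ACC<double> style.
#define OPS_2D
#include <ops_seq_v2.h>

// User kernel: operates on one mesh point via relative (i,j) offsets.
void copy_kernel(const ACC<double> &src, ACC<double> &dst) {
  dst(0, 0) = src(0, 0);
}

int main(int argc, char **argv) {
  ops_init(argc, argv, 1);            // initialise OPS (diagnostics level 1)

  int size[2] = {64, 64};             // interior mesh size
  int base[2] = {0, 0};
  int d_m[2]  = {0, 0};               // halo depths (none in this sketch)
  int d_p[2]  = {0, 0};

  ops_block grid = ops_decl_block(2, "grid");
  ops_dat u_old  = ops_decl_dat(grid, 1, size, base, d_m, d_p,
                                (double *)nullptr, "double", "u_old");
  ops_dat u_new  = ops_decl_dat(grid, 1, size, base, d_m, d_p,
                                (double *)nullptr, "double", "u_new");

  int pt[] = {0, 0};
  ops_stencil S2D_00 = ops_decl_stencil(2, 1, pt, "00");

  int range[4] = {0, 64, 0, 64};      // iteration range: x_lo, x_hi, y_lo, y_hi
  // The same ops_par_loop is code-generated by OPS into MPI, MPI+OpenMP
  // or CUDA variants without changing the application source.
  ops_par_loop(copy_kernel, "copy", grid, 2, range,
               ops_arg_dat(u_old, 1, S2D_00, "double", OPS_READ),
               ops_arg_dat(u_new, 1, S2D_00, "double", OPS_WRITE));

  ops_exit();
  return 0;
}
```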
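The third contribution mentions eliminating the adverse effect of loop fission through loop fusion. The sketch below is a generic, framework-free illustration of that idea, not the thesis' actual OPS-level implementation: two separate element-wise sweeps (loop fission) stream the input array through memory twice, while the fused loop reads each element once, cutting memory traffic.

```cpp
#include <vector>
#include <cstddef>

// Fission: two passes over u, roughly doubling the memory traffic on u.
void fissioned(const std::vector<double> &u,
               std::vector<double> &a, std::vector<double> &b) {
  for (std::size_t i = 0; i < u.size(); ++i) a[i] = 2.0 * u[i];
  for (std::size_t i = 0; i < u.size(); ++i) b[i] = u[i] * u[i];
}

// Fusion: one pass over u computes both results, improving memory
// access performance at the cost of slightly higher register pressure.
void fused(const std::vector<double> &u,
           std::vector<double> &a, std::vector<double> &b) {
  for (std::size_t i = 0; i < u.size(); ++i) {
    const double x = u[i];
    a[i] = 2.0 * x;
    b[i] = x * x;
  }
}
```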