
Parallel Algorithms Research Of Monte Carlo Method On Heterogeneous Architecture

Posted on: 2019-11-12
Degree: Master
Type: Thesis
Country: China
Candidate: X Ye
Full Text: PDF
GTID: 2370330590951773
Subject: Nuclear Science and Technology
Abstract/Summary:
As one of the two core methods of reactor physics analysis, the Monte Carlo method has two prominent advantages: fine geometric modeling capability and continuous-energy pointwise cross sections. Compared with deterministic methods, the Monte Carlo method gives an accurate distribution of the neutron flux in the reactor by simulating particle histories. With the rapid development of supercomputers, the Monte Carlo method is playing an increasingly important role in high-fidelity reactor calculations. Supercomputers have, to a certain extent, solved the problem of the excessive computational resources that Monte Carlo codes require, but the efficiency of the Monte Carlo code itself remains very limited.

Based on the Reactor Monte Carlo code RMC, this research analyzes the hot-spot functions in particle transport calculations. It proposes two general optimization methods, suffix (postfix) expressions and pointer addressing, which greatly increase the efficiency of criticality calculations. RMC implements a layered parallel algorithm based on MPI and OpenMP, which uses shared memory to reduce the high per-node memory consumption of Monte Carlo calculations. This study found that the use of OpenMP in RMC has efficiency problems: with multiple threads, the speedup is consistently below ideal. It therefore proposes using Pthreads instead of OpenMP for thread-level parallelism. Test results show that the Pthreads implementation accelerates very well: the speedup is close to linear within a node and remains close to linear when combined with MPI.

The Sunway TaihuLight is a supercomputer independently developed by China, with a peak performance of 125.4 PFlop/s and a sustained performance of 93 PFlop/s. The whole system uses the domestic SW26010 processor, which adopts an on-chip heterogeneous fusion architecture to achieve fully independent control of software and hardware. This research ported the RMC code to the Sunway TaihuLight and solved three types of difficulties encountered in the process. Correctness verification and parallel efficiency tests were performed on the ported program. The results show that the program's correctness is preserved and that parallel efficiency remains above 90% when thousands of core groups are used. Targeting the hardware features of the ShenWei processor, a series of optimizations was applied to the ported code, including memory access optimization, SIMD vectorization, register communication, and master-slave cooperative parallelism. Comparing performance before and after optimization shows a speedup of roughly a factor of two in general. However, compared with commercial Intel platforms, the optimized program still has a 30% to 50% performance gap.
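The abstract does not say where the suffix expressions are applied. One common place in Monte Carlo transport codes is the point-in-cell test, where a cell is defined as a Boolean combination of surface half-spaces. The following is a minimal sketch under that assumption: the expression is stored once in postfix order and then evaluated with a stack for each particle position, with no parenthesis handling at run time. The function name `evaluate_postfix`, the token conventions, and the `surface_sense` dictionary are hypothetical and are not taken from RMC.

```python
# Hypothetical sketch: point-in-cell test using a postfix (suffix) expression.
# The token format is an assumption for illustration, not RMC's actual format.

def evaluate_postfix(tokens, surface_sense):
    """Evaluate a postfix Boolean region expression.

    tokens: integers are signed surface ids (+3 means "positive side of
            surface 3", -3 means "negative side"); strings are operators:
            "&" intersection, "|" union, "!" complement.
    surface_sense: dict mapping surface id -> True if the point lies on the
            positive side of that surface.
    """
    stack = []
    for tok in tokens:
        if tok == "&":
            right = stack.pop()
            left = stack.pop()
            stack.append(left and right)
        elif tok == "|":
            right = stack.pop()
            left = stack.pop()
            stack.append(left or right)
        elif tok == "!":
            stack.append(not stack.pop())
        else:
            # Operand: true when the point is on the side named by the sign.
            stack.append(surface_sense[abs(tok)] == (tok > 0))
    if len(stack) != 1:
        raise ValueError("malformed postfix expression")
    return stack[0]


# The infix region "(-1 & 2) | !3" written in postfix order.
region = [-1, 2, "&", 3, "!", "|"]
```

For example, `evaluate_postfix(region, {1: False, 2: True, 3: True})` checks a point on the negative side of surface 1 and the positive side of surfaces 2 and 3. Converting the infix definition to postfix once at input time keeps the per-particle evaluation to a single linear pass over the tokens.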
Keywords/Search Tags:Monte Carlo, Layered Parallelism, Heterogeneous Parallelism, Sunway TaihuLight, Efficiency Optimization