Parallel Design And Optimization Of A Galaxy Group Finding Algorithm On SGI Vs. Distributed-memory Cluster

Posted on: 2018-11-19

Degree: Master

Type: Thesis

Country: China

Candidate: Y M Si

GTID: 2370330596989159

Subject: Computer technology

Abstract/Summary:

Galaxy group finding is one of the most important research topics in astrophysics: it helps researchers understand the formation and evolution of galaxy groups and of large-scale structure in the universe. The Halo-based Galaxy Group Finder (HGGF) is an effective algorithm, based on dark matter halos, that identifies galaxy groups from galaxy coordinates, redshift, mass and other properties. However, the current pure OpenMP implementation is limited by its programming model: it can use the resources of only a single compute node, which undermines its application to large-scale group finding problems. One possible solution is to run such problems on Coherent Shared Memory (CSM) machines, which offer large CPU and memory resources; their high price is a major concern, however, and their effectiveness for this kind of algorithm still needs to be tested. Another solution is to use the resources of multiple cluster nodes, which reduces execution time and makes large problems solvable. The latter approach requires redesigning and reimplementing the algorithm, and one of the major hurdles is remote memory access: the algorithm's semi-random accesses to galaxies degrade performance in a multi-node environment. To tackle this problem, we parallelized the algorithm with an adjacent galaxy list design and implemented it in Unified Parallel C (UPC). The kernel achieved speedups of 2.25, 2.78 and 5.07 times on 4, 8 and 16 nodes respectively, and the memory requirement on each node was also reduced significantly. Experiments with the OpenMP version of the algorithm on an SGI UV 2000 show that, owing to the nature of the program and the features of the NUMA architecture, programs with random memory access behavior such as HGGF may not readily benefit from the large number of threads and the shared memory that such CSM machines provide. A two-level parallel design that takes advantage of the locality principle on distributed-memory clusters solves the HGGF problem more efficiently.
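
To put the reported kernel speedups in context, they can be converted to parallel efficiency (speedup divided by node count). The short Python sketch below does only that arithmetic. It assumes the speedups are measured relative to a single-node run, which the abstract does not state, and the function and variable names are illustrative rather than taken from the thesis.

```python
def parallel_efficiency(speedup: float, nodes: int) -> float:
    """Return speedup / nodes; 1.0 would correspond to ideal linear scaling."""
    if nodes <= 0:
        raise ValueError("nodes must be positive")
    return speedup / nodes


# Kernel speedups quoted in the abstract, keyed by node count.
# Assumption: the baseline is a single-node run (not stated in the abstract).
REPORTED_SPEEDUPS = {4: 2.25, 8: 2.78, 16: 5.07}


def efficiency_table(speedups: dict) -> dict:
    """Map each node count to its parallel efficiency."""
    return {nodes: parallel_efficiency(s, nodes) for nodes, s in speedups.items()}


if __name__ == "__main__":
    for nodes, eff in efficiency_table(REPORTED_SPEEDUPS).items():
        print(f"{nodes:2d} nodes: speedup {REPORTED_SPEEDUPS[nodes]:.2f}, "
              f"efficiency {eff:.3f}")
```

Under that assumption the efficiency falls from roughly 0.56 on 4 nodes to roughly 0.32 on 16 nodes, which is consistent with the abstract's point that remote memory access is a major hurdle in the multi-node setting.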

Keywords/Search Tags:

High performance computing, Galaxy group finding, Parallel computing, UPC, OpenMP

Related items

1	Research On OpenMP 4.0 Based Heterogeneous Parallel Computing Techniques For CFD Applications
2	Research Of Parallel Computing In Computational Fluid Dynamics
3	The Research And Implementation Of Parallel Computing On Numerical Simulation Of The Explosion And Impact Phenomenon
4	The Parallel Computing Research On High-performance Spatial Analysis Under Cpu/Gpu Heterogeneous Environment
5	Parallel Calculation And Optimization Research Of Magnetic Shell Parameter L Value Based On OpenMP
6	Study On The Parallel Computing In GRAPES High Resolution Numerical Weather Prediction Model
7	Research On The Key Technologies Of High Performance Computing WebGIS Model
8	Research And Implementation Of Visualization Methods On High-performance Computing Platform
9	Research On High-performance Computing Methods For Time-domain FWI
10	Research On Parallel Solution For Cutting & Packing Problem Based On GPU High-Performance Computing