| In recent years, High Performance Computing (HPC) has achieved a great deal in research fields such as engineering, science, and the military. International scholars generally agree that the enormous challenges of the 21st century, such as global climate forecasting, genetic recombination, and geological resource exploration, can only be solved with HPC. High-performance computing has therefore drawn great attention both at home and abroad. In the 1990s, the United States formulated research programs in the HPC field, for instance ASCI and HPCC. In our country, the government has also paid close attention to it. Massively parallel computing, experiment, and theory are now the three pillars on which human understanding of the natural world rests. MPI (Message Passing Interface) is the predominant message-passing standard for many HPC applications and is independent of language and platform. Because it is a library, programmers can call its interfaces directly from FORTRAN and C. MPICH is an open-source implementation of the MPI standard; it is widely used in industry and has also attracted the attention of the academic community. MSU and Argonne National Laboratory made great contributions to MPICH. The emergence of MPICH makes it easier to write efficient parallel applications and, furthermore, promotes the development of high-performance computing technology. MPI provides plenty of communication primitives, which can be classified as point-to-point or collective communications. By profiling, we find that the time spent in collective operations accounts for a large proportion of the transfer time in some applications. Thus, to obtain high speed-ups for parallel applications, we need to improve the performance of collective communication. MPI_Bcast is one of the most frequently used collective communication primitives: a so-called root process sends its message to all other processes. 
In this paper, we optimize collective operations on multi-core computing nodes based on MPICH2 and propose the Balance_Bcast algorithm for multi-clusters. The algorithm is mainly concerned with two key issues of the multi-cluster environment: (1) Cluster size. The number of computing nodes may differ from cluster to cluster. (2) Link delay. Computing nodes within a cluster are connected by a high-speed network (such as InfiniBand), but the links between clusters have differing delays and are slower than those inside a cluster. We focus on making the intra-cluster broadcasts complete at the same time. In MPI_Bcast in MPICH2, the binomial tree algorithm is widely employed for small messages. This algorithm does not consider the links between nodes, so slow links may be used frequently, which can degrade its performance. Optimizing collective operations simply by treating each cluster as a communicator does not account for differences in cluster size, so the total time of the algorithm may depend on one particular cluster. The performance of that approach is therefore not always good. We consider the two key issues (cluster size and link delay) together and propose the Balance_Bcast algorithm. The algorithm estimates the time each cluster needs to complete its broadcast and sorts the clusters by that estimate; the message is sent to the most important cluster first, and the estimation and sorting are repeated until every cluster has at least one node holding the message, at which point the scheduling algorithm stops. Finally, the binomial tree algorithm is used to broadcast within each cluster. Due to the limitations of our network conditions, we use the NS2 simulation software to generate the network topology and test the proposed algorithm. Detailed simulation shows that the Balance_Bcast algorithm performs better than the algorithms mentioned above and achieves the expected goals of this work. |
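
The scheduling idea described above can be illustrated with a small sketch. This is not the paper's implementation; it is a simplified greedy model built on several assumptions that are not stated in the text: the "most important" cluster is taken to be the one with the largest estimated intra-cluster broadcast time, that estimate is the number of binomial-tree rounds times a fixed `local_step` cost, each cluster has one gateway that sends inter-cluster messages one at a time, and the `delay` matrix of inter-cluster send times is known in advance. The names `binomial_rounds`, `schedule`, `sizes`, `delay`, and `local_step` are hypothetical.

```python
def binomial_rounds(n):
    """Rounds a binomial-tree broadcast needs to reach n processes, i.e. ceil(log2(n))."""
    return (n - 1).bit_length()


def schedule(sizes, delay, root=0, local_step=1.0):
    """Greedy inter-cluster broadcast schedule (illustrative assumption only).

    sizes[i]    -- number of nodes in cluster i
    delay[i][j] -- time for cluster i's gateway to send the message to cluster j
    Returns (order, finish): order is a list of (sender, target, arrival_time),
    finish is the estimated time at which the last cluster completes its
    local binomial-tree broadcast.
    """
    # Estimated intra-cluster broadcast time for every cluster.
    local = [binomial_rounds(s) * local_step for s in sizes]

    free_at = {root: 0.0}   # when each informed gateway can start its next send
    arrival = {root: 0.0}   # when each cluster first receives the message
    pending = set(range(len(sizes))) - {root}
    order = []

    while pending:
        # Serve the cluster with the longest local broadcast first
        # (ties broken by lower cluster index).
        target = max(pending, key=lambda c: (local[c], -c))
        # Pick the informed gateway that can deliver the message earliest.
        sender = min(free_at, key=lambda s: (free_at[s] + delay[s][target], s))
        t = free_at[sender] + delay[sender][target]
        free_at[sender] = t   # sender is busy until the message is delivered
        free_at[target] = t   # target can forward from this moment on
        arrival[target] = t
        pending.remove(target)
        order.append((sender, target, t))

    finish = max(arrival[c] + local[c] for c in arrival)
    return order, finish


if __name__ == "__main__":
    # Four hypothetical clusters with made-up sizes and link delays.
    sizes = [4, 16, 2, 8]
    delay = [
        [0, 5, 2, 3],
        [5, 0, 4, 6],
        [2, 4, 0, 1],
        [3, 6, 1, 0],
    ]
    order, finish = schedule(sizes, delay)
    for sender, target, t in order:
        print(f"cluster {sender} -> cluster {target}, arrives at {t}")
    print("estimated completion:", finish)
```

In this model the largest cluster is served first so that its longer local broadcast overlaps with the remaining inter-cluster transfers, which is one way to push the intra-cluster completion times closer together. The numbers in the example are invented for illustration and are not taken from the NS2 experiments.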