| Nowadays, distributed memory architectures are used by mainstream high-performance computing systems, and because of their good scalability and application prospects they have become a research focus in parallel compilation. In a distributed memory parallel system, remote memory access over the network costs any computing node far more than local memory access. For this reason, communication techniques play an important role in research on parallelizing compilation for distributed memory architectures. This dissertation is based on the development of the SW-VEC parallelizing compiler and studies loop redistribution in parallelizing compilation for distributed memory architectures. The main contents and contributions are as follows: 1. Accurate communication between loops. Traditional redistribution communication between loops usually contains a large amount of redundant communication data. To address this problem, this dissertation proposes an accurate communication code generation algorithm, which obtains the accurate communication set by computing the intersection of the local data spaces before and after the redistribution. According to the logical relationship between the communicating processors, redistribution communication is divided into data reorganization communication and neighbor communication, each of which uses different communication primitives and has its own specific communication code generated. The experimental results show that the algorithm effectively improves redistribution communication between loops, eliminates communication redundancy, and improves the performance of parallel programs. 2. Cost evaluation of redistribution communication between loops. In parallelizing compilation for distributed memory architectures, the decomposition of computation and data in loops is dynamic.
The dynamic decomposition process needs an estimate of the potential communication cost as guidance. Previous cost evaluation methods consider neither how the communication is implemented nor the hardware characteristics of the target machine, so they provide no effective guidance for dynamic decomposition. Based on the accurate redistribution communication method, this dissertation builds a cost model for the corresponding redistribution communication. In the model, the cost of redistribution communication is divided into three parts: packing cost, data transmission cost, and unpacking cost. Taking the communication primitives into account, the detailed implementations and costs of data reorganization communication and neighbor communication are evaluated. The experimental results show that using this cost analysis algorithm as the reference during dynamic decomposition effectively avoids unprofitable parallelization at compile time. 3. Communication code generation for a class of irregular loops. Irregular applications can also be parallelized. For irregular array accesses, however, traditional affine decomposition cannot describe the access function, so the compiler cannot decompose the loop or generate communication code and has to give up parallelization. For a class of irregular loops that contain a parallel loop layer with affine bounds, this dissertation proposes a method of decomposition and communication generation. In this method, a special affine function is used to describe the access function of the irregular array, and redundant communication is used to satisfy the producer-consumer relationships of the irregular loop. The experimental results show that this method is effective and achieves the expected speedup. The optimization techniques in this dissertation have been implemented and applied in the SW-VEC parallelizing compilation system, and their correctness and efficiency have been validated. |
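
To illustrate the idea behind contributions 1 and 2, the following is a minimal sketch and not the SW-VEC implementation. It assumes a one-dimensional array that is redistributed from a block layout to a cyclic layout, computes each sender-to-receiver communication set as the intersection of the sender's local data space before the redistribution with the receiver's local data space after it, and then applies a three-part cost (packing, transmission, unpacking) to one message. All function names and cost parameters (`t_pack`, `t_unpack`, `latency`, `t_elem`) are hypothetical and chosen only for illustration.

```python
def block_owner(i, n, p):
    """Owner of element i under a block distribution of n elements on p processors."""
    size = -(-n // p)  # ceiling division: elements per block
    return i // size


def cyclic_owner(i, n, p):
    """Owner of element i under a cyclic distribution (n is unused, kept for a uniform signature)."""
    return i % p


def local_sets(owner, n, p):
    """Local data space of each processor: the set of indices it owns."""
    sets = [set() for _ in range(p)]
    for i in range(n):
        sets[owner(i, n, p)].add(i)
    return sets


def communication_sets(before, after):
    """Intersect what each sender owns before with what each receiver owns after.

    Data that stays on the same processor is skipped, so only the elements
    that really have to move appear in the result.
    """
    comm = {}
    for src, owned in enumerate(before):
        for dst, needed in enumerate(after):
            if src == dst:
                continue
            common = owned & needed
            if common:
                comm[(src, dst)] = sorted(common)
    return comm


def message_cost(count, t_pack, t_unpack, latency, t_elem):
    """Hypothetical three-part cost of one message: packing + transmission + unpacking."""
    packing = count * t_pack
    transmission = latency + count * t_elem
    unpacking = count * t_unpack
    return packing + transmission + unpacking


# Assumed example: 8 elements on 2 processors, block -> cyclic.
before = local_sets(block_owner, 8, 2)
after = local_sets(cyclic_owner, 8, 2)
comm = communication_sets(before, after)
moved = sum(len(v) for v in comm.values())
```

In this assumed example only 4 of the 8 elements change owner, so a scheme that sends every element regardless of its final owner would transfer twice as much data. A compiler could feed `len(comm[(src, dst)])` into a cost function such as `message_cost` to compare candidate decompositions, although the actual model described above also distinguishes the communication primitives used.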