| Load balancing and the iterative solver are the two major factors that affect the parallel efficiency of structured multi-block CFD applications. In parallel computations on structured multi-block grids, the grid block is usually the smallest unit of work assigned to a process. Reaching a larger parallel scale with better scalability therefore generally requires repartitioning the grid blocks. Traditionally, this is done by explicit repartition before the parallel solution starts: the grid blocks are divided into groups of blocks so that each group holds nearly the same total number of grid points. In practical engineering applications, however, explicit repartition may have to be repeated many times, which makes pre-processing time-consuming and complicated. Implicit algorithms offer good stability in CFD simulation, but the Jacobian matrix is often too difficult to derive and too large to store. At present, implicit algorithms adopt coarse approximations to the Jacobian matrix, which reduces the convergence rate of the iteration. Jacobian-Free Newton-Krylov (JFNK) is a recently developed nonlinear numerical method. It requires only the product of the Jacobian matrix with a vector, so forming the Jacobian matrix and solving with it directly can be avoided. PETSc is a representative open-source framework for parallel numerical computing that provides parallel data structures such as vectors and matrices. It hides the implementation details of common numerical algebra algorithms, parallel computation, and communication, and thus greatly reduces the difficulty of developing parallel programs. PETSc provides not only a parallel data structure called Distributed Memory Distributed Array (DMDA), which targets a single structured grid, but also many nonlinear and linear solvers for scientific computing, including the JFNK algorithm.
For large-scale parallel computing of structured multi-block grid CFD, we first keep the traditional explicit repartition of grid blocks and replace the original left-hand-side solver of the CFD software with the nonlinear JFNK solver from PETSc. Building on this, we then use the PETSc data structure Distributed Memory Distributed Array (DMDA) to avoid explicit repartition; we call this implicit repartition. For real three-dimensional structured-grid CFD applications, implementing a massively parallel JFNK nonlinear solver with the data structures and JFNK algorithm of PETSc raises a number of key issues in the CFD solver, including the numerical method, the parallel software framework, grid repartition, software integration, and coverage testing. Therefore, this paper carries out the following work: (1) The relations between the functional modules and data flows in CFD computations are clarified through an analysis of the original CFD code. The numerical model of JFNK in the CFD application is established, including the nonlinear function for steady flow, the nonlinear functions for single-time-step and dual-time-step unsteady flow, the JFNK algorithm, and several of its preconditioners. A scheme for integrating the JFNK method with the CFD software is presented. The PETSc parallel software framework is analysed comprehensively, including its data structures, numerical computing functions, software architecture, and JFNK support. This study of the data structures, software architecture, and numerical functions of PETSc clarifies how the general JFNK method is realized within the PETSc framework and lays the foundation for secondary development based on PETSc. (2) The JFNK parallel solver from PETSc is integrated with the original CFD code.
The data structures and computational workflows of the CFD code are reorganized according to the data and functional requirements of parallel computing in PETSc. Still based on the traditional explicit repartition, the emphasis is placed on reorganizing the right-hand-side computations required by the JFNK nonlinear solver and the functional modules of the original CFD code that can serve as JFNK preconditioners. (3) The JFNK parallel solver of (2) is further implemented without explicit repartition. The study identifies a PETSc data structure, the Distributed Memory Distributed Array (DMDA), that has the potential for automatic load balancing. For structured multi-block CFD, a relatively independent MPI communication subsystem is established for each original grid block, and the DMDA data structure is adopted within it for automatic, efficient parallel computing. The functions provided by the DMDA parallel data structure make implicit repartition of the grid blocks possible. (4) Both versions of the code are tested on the "TianHe-2" supercomputer with different problem sizes, and the results are analysed. Three-dimensional steady cylinder flow is chosen as the test case, with a largest grid size of 1.35 billion and a maximum of 5184 CPU cores. The results show that, for structured multi-block grid CFD simulation, both versions of the code developed in this paper are feasible, and that the implicit repartition version performs relatively better and scales well. |
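
The key property of JFNK stated above, that only Jacobian-vector products are needed, can be illustrated with a minimal serial sketch. This is not the PETSc implementation and not the code developed in this work: the function names (`jacobian_vector_product`, `gmres`, `jfnk`), the forward-difference step size, the absence of a preconditioner, and the two-equation toy system `residual` are all assumptions chosen for illustration. The sketch approximates J(u)v by (F(u + hv) - F(u))/h and passes that product to an unrestarted GMRES iteration inside a Newton loop.

```python
import math

def norm(x):
    return math.sqrt(sum(xi * xi for xi in x))

def dot(x, y):
    return sum(xi * yi for xi, yi in zip(x, y))

def jacobian_vector_product(F, u, v, eps=1e-7):
    """Approximate J(u) v by a forward difference of F; J is never assembled."""
    nv = norm(v)
    if nv == 0.0:
        return [0.0] * len(u)
    h = eps * (1.0 + norm(u)) / nv  # assumed step-size heuristic
    Fu = F(u)
    Fp = F([ui + h * vi for ui, vi in zip(u, v)])
    return [(a - b) / h for a, b in zip(Fp, Fu)]

def gmres(matvec, b, tol=1e-10, max_iter=50):
    """Unrestarted, unpreconditioned GMRES from x0 = 0; needs only matvec(v)."""
    n = len(b)
    beta = norm(b)
    if beta == 0.0:
        return [0.0] * n
    Q = [[bi / beta for bi in b]]        # orthonormal Krylov basis
    H, cs, sn, g = [], [], [], [beta]    # Hessenberg columns, rotations, rhs
    for k in range(min(max_iter, n)):
        w = matvec(Q[k])
        h = []
        for q in Q:                      # modified Gram-Schmidt
            hij = dot(w, q)
            h.append(hij)
            w = [wi - hij * qi for wi, qi in zip(w, q)]
        hk1 = norm(w)
        for i in range(k):               # apply earlier Givens rotations
            t = cs[i] * h[i] + sn[i] * h[i + 1]
            h[i + 1] = -sn[i] * h[i] + cs[i] * h[i + 1]
            h[i] = t
        r = math.hypot(h[k], hk1)        # new rotation zeroes the subdiagonal
        c, s = h[k] / r, hk1 / r
        h[k] = r
        g.append(-s * g[k])
        g[k] = c * g[k]
        cs.append(c)
        sn.append(s)
        H.append(h)
        if abs(g[k + 1]) <= tol * beta or hk1 == 0.0:
            break
        Q.append([wi / hk1 for wi in w])
    m = len(H)
    y = [0.0] * m
    for i in reversed(range(m)):         # back-substitution on the triangular system
        acc = g[i] - sum(H[j][i] * y[j] for j in range(i + 1, m))
        y[i] = acc / H[i][i]
    return [sum(y[j] * Q[j][i] for j in range(m)) for i in range(n)]

def jfnk(F, u0, newton_tol=1e-10, max_newton=20):
    """Newton iteration whose linear solves use only finite-difference J*v products."""
    u = list(u0)
    for _ in range(max_newton):
        Fu = F(u)
        if norm(Fu) < newton_tol:
            break
        du = gmres(lambda v: jacobian_vector_product(F, u, v),
                   [-f for f in Fu])
        u = [ui + di for ui, di in zip(u, du)]
    return u

def residual(u):
    """Hypothetical toy system: a circle of radius 2 intersected with the line u0 = u1."""
    return [u[0] ** 2 + u[1] ** 2 - 4.0, u[0] - u[1]]

solution = jfnk(residual, [1.0, 2.0])
```

In a CFD setting, `residual` would correspond to the right-hand-side evaluation of the flow solver, which is why the work above concentrates on reorganizing that part of the original code; a practical solver would also add a preconditioner and restarts, both omitted here.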