| The solution of large-scale sparse linear systems is a fundamental component of numerical methods for partial differential equations and plays an important role in modern scientific and engineering applications. With the continuing development of high-performance computing, these applications are moving toward larger scales and higher precision, so heterogeneous parallel algorithms for solving large-scale sparse linear systems have become both an important challenge and an urgent need in supporting large-scale, high-precision numerical simulation. This thesis studies heterogeneous parallel direct and iterative algorithms for solving large-scale sparse linear systems on the Sunway manycore architecture. The main contributions are as follows: (1) A heterogeneous direct solution algorithm based on LU factorization is proposed for solving sparse linear systems on the Sunway manycore architecture. The algorithmic characteristics and coupling of each step of the direct solution algorithm are analyzed, and two heterogeneous parallel strategies, one based on computational blocks and one based on computational tasks, are proposed to match the computational characteristics of the permutation algorithm based on multilevel graph partitioning, the numerical factorization algorithm based on LU factorization, and the triangular solution algorithm. Using the computational block strategy, a heterogeneous parallel permutation algorithm based on multilevel graph partitioning is proposed; using the computational task strategy, a heterogeneous parallel numerical factorization algorithm based on LU factorization and a heterogeneous parallel triangular solution algorithm are proposed. In addition, a heterogeneous parallel algorithm for computational blocks and a series of heterogeneous parallel algorithms for matrix-matrix and matrix-vector operations are
proposed. Numerical experiments on matrices from the Florida Sparse Matrix Collection and on a real application show that the proposed heterogeneous parallel algorithms significantly improve the computational efficiency of the direct solution algorithm. (2) To address the communication inefficiency and communication bottleneck of the direct solution algorithm, a binary tree communication model and a group communication model are proposed, and numerical experiments are conducted using the Florida Sparse Matrix Collection. The experiments show that the binary tree communication model significantly improves the communication efficiency of the direct solution algorithm, while the group communication model removes the memory bottleneck of large-scale global communication and improves the parallel scalability of the algorithm. (3) To address the selection of the permutation algorithm and of the factorization parameters for the direct solution algorithm, both are tuned for the Sunway manycore architecture through extensive analysis and experiments. (4) A new parallel randomized iterative algorithm is proposed for the parallel solution of overdetermined sparse linear systems. Because the randomized iterative algorithm has low computational density, is tightly coupled, and is difficult to parallelize, a greedy sampling strategy and a delayed approximation strategy are proposed to decouple the algorithm and significantly improve its parallel computational efficiency. In addition, a residual estimation strategy is proposed that significantly reduces the additional cost of computing the residuals. Numerical experiments on random overdetermined linear systems show that the proposed parallel randomized iterative algorithm significantly outperforms parallel Krylov subspace
algorithms. |
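
The direct solution algorithm summarized in contribution (1) proceeds in three phases: permutation, numerical factorization, and triangular solution. The following is a minimal sequential sketch of that pipeline, under several assumptions that are not taken from the thesis: the matrix is stored densely, row permutation by partial pivoting stands in for the fill-reducing permutation based on multilevel graph partitioning, and the function names `lu_factor` and `lu_solve` are hypothetical. It illustrates only how the phases fit together, not the heterogeneous parallel algorithms themselves.

```python
# Sequential, dense sketch of the three phases of an LU-based direct solver.
# Assumptions (not from the thesis): dense storage, partial pivoting as the
# permutation step, hypothetical function names.

def lu_factor(a):
    """Factor P*A = L*U. Returns (lu, perm), where lu packs the unit lower
    triangle L (below the diagonal) and U (on and above the diagonal), and
    perm[i] is the original index of the row now at position i."""
    n = len(a)
    lu = [row[:] for row in a]          # work on a copy
    perm = list(range(n))
    for k in range(n):
        # Permutation phase (stand-in): choose the largest pivot in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        if lu[p][k] == 0.0:
            raise ValueError("matrix is singular")
        lu[k], lu[p] = lu[p], lu[k]
        perm[k], perm[p] = perm[p], perm[k]
        # Numerical factorization phase: eliminate entries below the pivot.
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]        # multiplier, stored in L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, perm


def lu_solve(lu, perm, b):
    """Triangular solution phase: solve L*y = P*b, then U*x = y."""
    n = len(lu)
    y = [b[perm[i]] for i in range(n)]  # apply the row permutation to b
    for i in range(n):                  # forward substitution (unit diagonal)
        for j in range(i):
            y[i] -= lu[i][j] * y[j]
    x = y[:]
    for i in reversed(range(n)):        # back substitution
        for j in range(i + 1, n):
            x[i] -= lu[i][j] * x[j]
        x[i] /= lu[i][i]
    return x


# Small usage example: the factorization is computed once and reused.
A = [[4.0, 3.0, 0.0], [3.0, 4.0, -1.0], [0.0, -1.0, 4.0]]
lu, perm = lu_factor(A)
x = lu_solve(lu, perm, [10.0, 8.0, 10.0])
print(x)
```

Separating factorization from the triangular solves mirrors the phase structure described above: the factorization is the expensive step and can be reused for several right-hand sides, which is one reason each phase can be given its own parallel strategy.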