
Parallel Preprocessing Algorithm on Multi-CPU Multi-Core System

Posted on: 2015-09-11
Degree: Master
Type: Thesis
Country: China
Candidate: B Yang
GTID: 2270330467450497
Subject: Computational Mathematics
Abstract/Summary:
Solving large linear systems is one of the basic tasks in scientific and engineering computing, and its performance tends to determine the efficiency of the overall numerical simulation. At present, more than ninety percent of the TOP500 parallel computers in the world use a multicore architecture, so designing algorithms for solving linear systems on multicore parallel machines is of practical importance.

Algorithms for solving linear systems usually comprise three aspects: basic operations, solution methods, and acceleration techniques. The basic operations include matrix-vector multiplication, vector updates, vector inner products, and triangular solves. Solution methods mainly include splitting iterative methods, Krylov subspace iterative methods, and multigrid methods. Acceleration techniques mainly include parallelization and preconditioning.

As parallel computer architectures have developed, they have taken on a multi-level structure: clusters, nodes, CPUs, cores. In this architecture, communication between nodes is expensive, while communication between cores within the same CPU is cheap. This thesis presents a two-stage preconditioning algorithm for this architecture, aimed at making full use of current parallel machines. The first stage is a restricted additive Schwarz (RAS) technique based on domain decomposition, which requires little communication and converges well. The second stage focuses on fast computation among the cores of a CPU and is realized by thread-level programming. We designed two schemes: the RAS-block Jacobi preconditioner and the RAS-RAS preconditioner. Numerical experiments show that the two-stage preconditioners are more efficient than the one-stage preconditioner and scale well.

Matrix-vector multiplication has the highest complexity among the basic matrix operations, so we have designed a matrix-vector multiplication algorithm suited to multicore parallel computers. Based on the compressed sparse blocks (CSB) format, we give a multicore CSB storage format and a sparse matrix-vector multiplication built on it. Numerical experiments show that the speedup obtained with our format is higher, and that good scalability is obtained for banded matrices.
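To make the first-stage RAS idea concrete, here is a minimal sketch, not the thesis implementation. It assumes a 1D Poisson matrix tridiag(-1, 2, -1), exact subdomain solves by the Thomas algorithm, and a plain preconditioned Richardson iteration in place of a Krylov method. All function names (`laplacian_matvec`, `solve_local_laplacian`, `ras_apply`, `ras_richardson`, `residual_norm`) are hypothetical. Each subdomain solves on its overlapping index range but writes back only the entries it owns, which is the "restricted" part of RAS.

```python
def laplacian_matvec(x):
    """Return A x for the 1D Poisson matrix A = tridiag(-1, 2, -1) (assumed model problem)."""
    n = len(x)
    y = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        y.append(2.0 * x[i] - left - right)
    return y


def solve_local_laplacian(rhs):
    """Exact subdomain solve: Thomas algorithm for tridiag(-1, 2, -1) z = rhs."""
    m = len(rhs)
    c = [0.0] * m  # modified superdiagonal
    d = [0.0] * m  # modified right-hand side
    c[0] = -0.5
    d[0] = rhs[0] / 2.0
    for i in range(1, m):
        denom = 2.0 + c[i - 1]
        c[i] = -1.0 / denom
        d[i] = (rhs[i] + d[i - 1]) / denom
    z = [0.0] * m
    z[-1] = d[-1]
    for i in range(m - 2, -1, -1):
        z[i] = d[i] - c[i] * z[i + 1]
    return z


def ras_apply(r, owned, overlap):
    """Apply z = sum_i Rt_i^T A_i^{-1} R_i r, where `owned` lists disjoint (start, stop) ranges."""
    n = len(r)
    z = [0.0] * n
    for start, stop in owned:
        lo = max(0, start - overlap)   # extended (overlapping) subdomain
        hi = min(n, stop + overlap)
        local = solve_local_laplacian(r[lo:hi])
        for i in range(start, stop):   # restricted prolongation: owned entries only
            z[i] = local[i - lo]
    return z


def residual_norm(x, b):
    """Euclidean norm of b - A x."""
    ax = laplacian_matvec(x)
    return sum((bi - axi) ** 2 for bi, axi in zip(b, ax)) ** 0.5


def ras_richardson(b, owned, overlap, iters):
    """Stationary iteration x <- x + M^{-1} (b - A x) with M^{-1} given by RAS."""
    x = [0.0] * len(b)
    for _ in range(iters):
        ax = laplacian_matvec(x)
        r = [bi - axi for bi, axi in zip(b, ax)]
        z = ras_apply(r, owned, overlap)
        x = [xi + zi for xi, zi in zip(x, z)]
    return x
```

For example, `ras_richardson([1.0] * 16, [(0, 8), (8, 16)], 2, 50)` runs two subdomains with an overlap of two points. In a two-stage scheme of the kind the abstract describes, the exact call to `solve_local_laplacian` would presumably be replaced by a cheaper thread-level solve such as block Jacobi or an inner RAS. That substitution is not shown here.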
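The block-based sparse matrix-vector product can be illustrated with a simplified sketch. The abstract does not specify the thesis's multicore CSB layout, so the code below only assumes the general CSB idea: nonzeros are grouped into beta x beta blocks with block-local indices, and each block row updates a disjoint slice of the result, so block rows can be processed by separate threads. The nested dictionaries stand in for the flat arrays a real CSB implementation would use, and the names `build_csb`, `multiply_block_row`, and `csb_spmv` are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def build_csb(n, beta, triples):
    """Group COO triples (row, col, value) of an n x n matrix into beta x beta blocks.

    Returns a list indexed by block row; each entry maps a block column to
    a list of (local_row, local_col, value) tuples.
    """
    num_block_rows = (n + beta - 1) // beta
    blocks = [dict() for _ in range(num_block_rows)]
    for i, j, v in triples:
        blocks[i // beta].setdefault(j // beta, []).append((i % beta, j % beta, v))
    return blocks


def multiply_block_row(bi, block_row, beta, x, n):
    """Compute the slice of y owned by block row bi."""
    rows = min(beta, n - bi * beta)  # the last block row may be shorter
    y_local = [0.0] * rows
    for bj, entries in block_row.items():
        col0 = bj * beta             # global column offset of this block
        for li, lj, v in entries:
            y_local[li] += v * x[col0 + lj]
    return y_local


def csb_spmv(blocks, beta, x, workers=4):
    """y = A x, with one task per block row; tasks write to disjoint slices of y."""
    n = len(x)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        parts = pool.map(
            lambda item: multiply_block_row(item[0], item[1], beta, x, n),
            enumerate(blocks),
        )
        y = []
        for part in parts:           # pool.map preserves block-row order
            y.extend(part)
    return y
```

Python threads do not give real multicore speedup for this loop, so the sketch shows only the data layout and the independence of block rows. A thread-level implementation in a compiled language would follow the same decomposition.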
Keywords/Search Tags: Two-stage preconditioner, RAS-block Jacobi preconditioner, RAS-RAS preconditioner, Multicore CSB format