Font Size: a A A

Research On Heterogeneous Parallel Algorithms For Sparse Matrix Computation

Posted on:2016-03-16Degree:DoctorType:Dissertation
Country:ChinaCandidate:S GuoFull Text:PDF
GTID:1310330536967208Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
The heterogeneous high performance architecture is becoming one prevailing architecture of the high performance computing area,which boots the progress of the largescale scientific and engineering computing.The sparse matrix operations are one kind of important operations in the large-scale scientific and engineering computing.However,as one kind of irregular computing,the sparse matrix computing limits the performance of the high performance architecture.This reason mainly results from two perspectives.One is the behaviors of the sparse matrix computation depend on the sparsity structure of the sparse matrix,which makes the locality not be fully exploited,and the second one is that the existence of indirect and random memory accesses,which result in the inefficient utilization of the memory bandwidth,one of bottleneck for the sparse matrix computation.In order to address these issues and challenges,this paper focus on sparse matrix computation on heterogeneous architectures.The main work and innovations are as follows:(1)One optimized sparse matrix-vector multiplication algorithm for the heterogeneous architecture is proposedThis paper proposes a workload-based sparse matrix-vector multiplication algorithm with multiple parallelism modes.In order to address the irregular memory accesses and imbalance workload,this paper partitions the sparse matrix to reduce the redundant memory space overhead,chooses the appropriate thread mapping strategy on the workload,and design one warp-based workload partitioning method to reduce the synchronize overhead.Experimental results show that the performance of the algorithm can be improved greatly.(2)The optimized sparse matrix-transposed-vector multiplication algorithm for the heterogeneous architecture is proposedThis paper proposes a transpose-free algorithm for sparse matrix-transposed-vector multiplication.The out-product way is employed to reduce the overhead of the transposition and atomic operations,and multi-way merged sorting is exploited to eliminate the atomic operations.Experimental results show that the performance of sparse matrixtransposed-vector multiplication can be efficiently implemented.(3)One optimized sparse matrix-sparse matrix multiplication algorithm for the heterogeneous architecture is proposedThis paper proposes a CPU-GPU collaborative sparse matrix-sparse matrix multiplication algorithm on the heterogeneous platform.Many optimization methods are employed to improve the performance,including upper-bound and iterative-increment memory allocation strategy,many-way merge operation and parallel prefix-sum scan to reduce the overhead of the accumulation of the intermediate results,and the workload partition strategy based on the task of each thread.Finally,one CPU-GPU collaborative sparse matrix-sparse matrix multiplication is proposed to improve the performance further.(4)One optimized sparse triangular solving algorithm for the heterogeneous platforms is proposedThis paper proposes one level-set parallel algorithm for the sparse triangular solving.Based on the parallelism of different levels,one hybrid parallel scheme,consisting of cluster parallel mode and pipeline parallel mode,is employed to exploit the parallelism efficiently.The cluster mode is used to process the level with a large amount of work,while the pipeline mode is used to process the level with small amount of work.On this basis,one load-balancing algorithm is designed based on the workload of each level.Finally,one CPU-GPU collaborative sparse triangular solving algorithm is proposed to make use of all the computing resources of the heterogeneous parallel computing platforms.
Keywords/Search Tags:sparse matrix computation, parallel computation, heterogeneous architectures
PDF Full Text Request
Related items