
Source-to-Source Parallelization Research Of Loop For CUDA

Posted on: 2015-07-14
Degree: Master
Type: Thesis
Country: China
Candidate: X Y Sun
Full Text: PDF
GTID: 2298330422983515
Subject: Computer software and theory
Abstract/Summary:
In recent years, GPUs have been widely used in high-performance computing. GPU computation beyond graphics rendering is known as GPGPU. Traditional GPGPU development was difficult because the graphics API had to be used directly for programming. CUDA has since been widely adopted because it lowers the difficulty of writing parallel programs. However, developing parallel programs manually with CUDA remains a challenge, because programmers must master the GPU architecture and the CUDA programming model in depth. Reducing the difficulty of developing parallel programs is therefore important for the popularization and application of GPGPU.

This paper studies the problem of automatically generating GPU parallel programs and proposes a source-to-source parallelization framework called STS-CUDA. STS-CUDA transforms serial programs containing loops into CUDA C parallel programs for the GPU, making CUDA parallel programming more convenient. STS-CUDA works as follows: first, the serial C program is analyzed and STS-CUDA directives related to the parallel transformation are inserted in the appropriate places; then, by recognizing and matching these directives, the program is converted into the corresponding CUDA C parallel program. This paper studies the methods involved in realizing the parallel transformation with STS-CUDA, including dividing tasks reasonably, optimizing communication between host and device, and optimizing access to global memory and shared memory. Examples are tested at the end.

In experiments parallelizing two matrix multiplications and a BP (back-propagation) algorithm with STS-CUDA, the results and speedups are similar to those of handwritten CUDA parallel programs. Future work includes completely shielding the underlying GPU architecture by further reducing the STS-CUDA directives, and optimizing the target code by adding more optimization methods.
Keywords/Search Tags: GPU, GPGPU, CUDA, Source-to-Source Parallelization