
Research On Optimization Of OpenCL Based On AMD Platform And Its Application In Molecular Dynamics

Posted on: 2016-10-15 | Degree: Master | Type: Thesis
Country: China | Candidate: C L Zhao
GTID: 2180330479476634 | Subject: Computer Science and Technology
Abstract/Summary:
OpenCL was first proposed by Apple in 2008 and is now a heterogeneous programming framework managed by the Khronos Group. Its purpose is to provide a general-purpose parallel programming standard and framework so that the same program can run on devices produced by different manufacturers. Typical OpenCL applications include matrix operations, image processing, and molecular dynamics. How to maximize the computing capability of heterogeneous platforms with OpenCL has become a hot research issue.

This thesis focuses on OpenCL optimization on the AMD platform. First, it summarizes the evolution of the GPU, concentrating on how the GPU architecture on the AMD platform has developed and how its features have changed. Second, it studies OpenCL optimization methods on the AMD platform, including memory optimization and kernel optimization. Finally, it examines the overall architecture of the molecular dynamics package LAMMPS and how to improve its neighbor list algorithm and short-range force algorithm.

The main work of this thesis is as follows:
(1) Investigating the current state of heterogeneous parallel computing and selecting OpenCL on the AMD platform as the research object.
(2) Analyzing the acceleration theory of heterogeneous platforms and the features of the OpenCL framework, and finding that two main areas, memory access and kernel code, can cause bottlenecks and performance penalties.
(3) Summarizing, in light of the AMD platform architecture, OpenCL solutions to memory optimization and kernel optimization problems. For memory optimization, three parts are analyzed and summarized in depth: host memory allocation, global memory access, and local memory access. For kernel optimization, 11 programming rules that improve computational performance are derived.
(4) Analyzing the implementation mechanism of the OpenCL code in the LAMMPS acceleration package, including neighbor list building and short-range force calculation.
(5) Analyzing the LAMMPS neighbor list implementation process and implementing a single-GPU radix sort algorithm that can be used as a sub-step of building cell lists. A design for a dual-GPU radix sort algorithm is also provided.
(6) Optimizing the kernels of the short-range force implementation, based on the LAMMPS OpenCL acceleration package code and the summarized optimization rules, and then testing and verifying the correctness of the conclusions.
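The role of radix sort in cell-list construction, as described in item (5), can be illustrated with a minimal host-side sketch in C. This is not the thesis's GPU implementation: it is a sequential least-significant-digit radix sort, and the function name `radix_sort_by_cell`, the 8-bit digit width, and the paired `cell_id`/`atom_id` arrays are assumptions chosen for illustration. It shows the three phases (histogram, prefix sum, scatter) that a GPU version would typically parallelize.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define RADIX_BITS 8                    /* assumed digit width */
#define RADIX_SIZE (1u << RADIX_BITS)   /* 256 buckets per pass */

/* Hypothetical helper: sorts (cell_id, atom_id) pairs by cell_id so that
 * atoms in the same cell become contiguous. Stable LSD radix sort.
 * Returns 0 on success, -1 if temporary buffers cannot be allocated. */
int radix_sort_by_cell(uint32_t *cell_id, uint32_t *atom_id, size_t n)
{
    if (n == 0)
        return 0;

    uint32_t *tmp_cell = malloc(n * sizeof *tmp_cell);
    uint32_t *tmp_atom = malloc(n * sizeof *tmp_atom);
    if (!tmp_cell || !tmp_atom) {
        free(tmp_cell);
        free(tmp_atom);
        return -1;
    }

    for (unsigned shift = 0; shift < 32; shift += RADIX_BITS) {
        size_t count[RADIX_SIZE] = {0};

        /* Phase 1: histogram of the current digit. */
        for (size_t i = 0; i < n; ++i)
            count[(cell_id[i] >> shift) & (RADIX_SIZE - 1)]++;

        /* Phase 2: exclusive prefix sum gives each bucket's start offset. */
        size_t sum = 0;
        for (size_t d = 0; d < RADIX_SIZE; ++d) {
            size_t c = count[d];
            count[d] = sum;
            sum += c;
        }

        /* Phase 3: stable scatter into the temporary buffers. */
        for (size_t i = 0; i < n; ++i) {
            size_t dst = count[(cell_id[i] >> shift) & (RADIX_SIZE - 1)]++;
            tmp_cell[dst] = cell_id[i];
            tmp_atom[dst] = atom_id[i];
        }

        memcpy(cell_id, tmp_cell, n * sizeof *cell_id);
        memcpy(atom_id, tmp_atom, n * sizeof *atom_id);
    }

    free(tmp_cell);
    free(tmp_atom);
    return 0;
}
```

After the sort, all atoms that share a cell index occupy a contiguous range, so the start of each cell can be found with a single linear scan over `cell_id`.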
Keywords/Search Tags: OpenCL, AMD, optimization study, molecular dynamics, LAMMPS, short range force