An Efficient CPU-GPU Solver Based On DEM And Its Application In Die Filling

Posted on: 2014-03-28, Degree: Master, Type: Thesis
Country: China, Candidate: C S Luo, Full Text: PDF
GTID: 2250330401485480, Subject: Computational Mathematics
Abstract/Summary:
Particulate flows are commonly encountered in both engineering and environmental applications. The interaction of numerous particles makes their motion complex and gives it a degree of randomness, so an understanding of the interaction mechanisms in these systems is in high demand. The discrete element method (DEM) has attracted considerable attention because it can predict the whole motion of a particulate flow by tracking every single particle. However, the computational capability of the method depends strongly on the numerical scheme and on the hardware environment. Graphics processing units (GPUs) have recently burst onto the scientific computing scene as a technology that has yielded substantial improvements in performance and energy efficiency. Applications of GPU-based DEM are still limited but increasingly popular, and numerical simulations with engineering-scale numbers of particles are in especially high demand.

In this study, a DEM-based code named Trubal was parallelized in two steps: (1) the static storage structure was first reconstructed; (2) the resulting newer code was then parallelized on a CPU-GPU heterogeneous architecture, using texture memory and bank-conflict-free shared memory to make the best use of GPU memory bandwidth. Numerical simulations were carried out, and Trubal, the newer serial code and the final parallel code were compared to show the benefits of this work. The particle distributions obtained are basically the same, which strongly supports the validity of the new codes. The final parallel code gave a substantial acceleration over Trubal. In simulations of 6,000 and 60,000 particles over 200,000 time steps, starting from some classical moments and run on given hardware, the optimized GPU techniques achieved speedups in computational time of 4.69 and 12.78, respectively.
Keywords/Search Tags:parallel computing, discrete element method, die filling, CPU-GPU heterogeneous architecture
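
As a rough illustration of what "tracking every single particle" involves in DEM, the sketch below advances a set of equal particles by one explicit time step. It is a minimal serial example under stated assumptions, not the Trubal implementation: the one-dimensional setting, the linear-spring contact force, the semi-implicit Euler update, and the names dem_step, radius, mass, k and dt are all hypothetical choices made for illustration. The abstract does not specify the contact model or integrator used in the thesis.

```python
def dem_step(x, v, radius, mass, k, dt):
    """Advance 1D particle positions x and velocities v by one time step.

    Assumed model (not taken from the thesis): equal spheres of the given
    radius and mass, with a linear repulsive spring of stiffness k acting
    whenever two particles overlap.
    """
    n = len(x)
    f = [0.0] * n  # net contact force on each particle

    # Pairwise contact detection: the O(n^2) loop that dominates the cost
    # and is the natural target for parallelization.
    for i in range(n):
        for j in range(i + 1, n):
            d = x[j] - x[i]
            overlap = 2.0 * radius - abs(d)
            if overlap > 0.0:
                direction = 1.0 if d > 0 else -1.0
                fn = k * overlap          # linear spring force magnitude
                f[i] -= fn * direction    # push i away from j
                f[j] += fn * direction    # equal and opposite force on j

    # Semi-implicit Euler: update velocities first, then positions.
    v_new = [v[i] + dt * f[i] / mass for i in range(n)]
    x_new = [x[i] + dt * v_new[i] for i in range(n)]
    return x_new, v_new
```

Each force evaluation depends only on the positions from the previous step, which is what makes this kind of loop a candidate for per-particle GPU threads. A production code would also replace the all-pairs search with a neighbor list or cell grid.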