With the development of big data, the scale of data sets for GPU applications has increased dramatically in recent years, which poses challenges to GPUs with limited memory capacity. With the support of unified virtual memory and demand paging, GPUs can execute in an application-transparent manner under memory oversubscription. However, such transparent management still comes at a severe performance cost, especially for applications with inter-kernel data sharing. While there have been many efforts to reduce the additional data migrations caused by memory oversubscription, few consider the reuse of shared data at the boundary between adjacent kernels. Due to limited memory capacity, a kernel often demands shared data that has already been evicted during the previous kernel, resulting in a significant number of costly data migrations. Therefore, this paper focuses on the reuse of shared data between neighboring kernels. The main contributions are as follows:

· Research on characteristics of GPGPU applications: This paper conducts an in-depth study of a large number of workloads in the GPGPU benchmarks. Based on this study, it systematically summarizes the programming modes and data access patterns of GPU applications. In addition, applications are classified and counted according to their programming modes. Finally, data sharing between the kernels of several test programs is analyzed quantitatively.

· CTA-Page cooperative data reuse mechanism: Based on the analysis of application characteristics and memory access patterns, this paper proposes a CTA-Page cooperative data reuse mechanism, called CPC, targeting applications whose kernels exhibit similar memory access characteristics. It transparently reduces the impact of memory oversubscription by coordinating CTA (Cooperative Thread Array) dispatch switching with page replacement switching to reuse inter-kernel shared data. Experimental results show that CPC reduces the page fault rate by an average of 46.6% compared to the Baseline, leading to an average of 90% and 65% performance improvement over the Baseline and the state-of-the-art memory management framework ETC, respectively.

· Hardware-software collaborative data reuse mechanism: This paper proposes a universal mechanism, called JCS, to reuse the shared data between kernels. Based on the global access information of each CTA obtained by the JIT (Just-In-Time) compiler and the state of the GPU memory, the CTA scheduling strategy is re-planned so that CTAs with higher priority can reuse shared data more efficiently. Experimental results show that, for single-kernel-type applications (multi-kernel-type applications), JCS reduces the page fault rate by an average of 26.6% (6.6%) compared to the Baseline, leading to an average of 58% (12.6%) and 38.8% (5%) performance improvement over the Baseline and ETC, respectively.
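To make the underlying problem concrete, the following toy simulation contrasts plain LRU eviction with a reuse-aware policy that prefers to evict pages not shared with the next kernel. This is only an illustrative sketch of the general idea of preserving inter-kernel shared data; the page counts, memory capacity, and the `protect_shared` heuristic are all assumptions for illustration, not the actual CPC or JCS algorithms.

```python
from collections import OrderedDict

def run_kernels(accesses_per_kernel, capacity, shared_pages=None,
                protect_shared=False):
    """Simulate demand paging across kernels; return total page faults.

    With protect_shared=True, the replacement policy prefers evicting
    pages NOT shared across kernels (a toy stand-in for a reuse-aware
    policy); otherwise it behaves as plain LRU."""
    memory = OrderedDict()          # page -> None, least recent first
    shared = shared_pages or set()
    faults = 0
    for accesses in accesses_per_kernel:
        for page in accesses:
            if page in memory:
                memory.move_to_end(page)   # hit: refresh recency
                continue
            faults += 1                    # miss: fault + data migration
            if len(memory) >= capacity:
                victim = None
                if protect_shared:
                    # least recent non-shared page, if any exists
                    victim = next((p for p in memory if p not in shared),
                                  None)
                if victim is None:
                    victim = next(iter(memory))  # plain LRU victim
                del memory[victim]
            memory[page] = None
    return faults

# Kernel 1 streams over pages 0..9; kernel 2 reuses pages 0..5 (shared
# data) and then reads four new pages, under a 6-page memory capacity.
k1 = list(range(10))
k2 = list(range(6)) + list(range(10, 14))
shared = set(range(6))

lru_faults = run_kernels([k1, k2], capacity=6)
reuse_faults = run_kernels([k1, k2], capacity=6, shared_pages=shared,
                           protect_shared=True)
print(lru_faults, reuse_faults)   # LRU faults more: it evicts the
                                  # shared pages before kernel 2 runs
```

Under plain LRU, kernel 1's streaming access evicts exactly the pages kernel 2 needs first, so every shared page faults again; the reuse-aware policy keeps them resident and kernel 2's shared accesses hit in memory.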