An Efficient Communication Method For Large-scale Graph Processing In Data Centers

Posted on:2022-10-23

Degree:Master

Type:Thesis

Country:China

Candidate:Y W Wu

Full Text:PDF

GTID:2480306572991189

Subject:Computer software and theory

Abstract/Summary:


Real-world graphs are growing rapidly and can easily exceed the on-chip (or on-board) storage capacity of an accelerator, which makes it difficult to analyze large-scale graphs on a single FPGA-based graph processing accelerator. Multi-FPGA acceleration is therefore necessary. Many cloud providers (e.g., Amazon, Microsoft, and Baidu) now expose FPGAs to users in their data centers, providing opportunities to accelerate large-scale graph processing. However, extending existing single-FPGA graph accelerators to a multi-FPGA graph processing system in the data center poses two main challenges. First, because existing single-FPGA graph accelerators come with customized programming models, runtime systems, and communication runtimes, it is difficult to reuse their infrastructure to build new distributed accelerators. Second, when a distributed graph accelerator running in the data center does not take the torus interconnection scheme into account, it incurs substantial unnecessary communication overhead. To address these challenges, this thesis presents FDGLib, a communication library for efficient large-scale graph processing in FPGA-accelerated data centers. FDGLib can scale out any existing single-FPGA graph accelerator to a distributed version in a data center with minimal hardware engineering effort. It provides six APIs that can be integrated into any FPGA-based graph accelerator with only a few lines of code modification. Considering the torus-based FPGA interconnection in data centers, FDGLib also improves communication efficiency using simple yet effective torus-friendly graph partitioning and placement schemes. We integrate FDGLib into HitGraph, a state-of-the-art graph accelerator. Our results on a 32-node Microsoft Catapult-like data center show that the distributed HitGraph can be 2.32× and 4.77× faster than the state-of-the-art distributed FPGA-based and CPU-based solutions (i.e., ForeGraph and Gemini, respectively), with better scalability.
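To illustrate why placement matters on a torus, the following is a minimal sketch and not FDGLib's actual scheme, which the abstract does not detail. It assumes a hypothetical 4 x 8 layout for a 32-node torus, an invented traffic table between three graph partitions, and a simple cost model (traffic volume times hop count). The function names `torus_hops` and `placement_cost` are illustrative only.

```python
def torus_hops(a, b, dims):
    """Minimum hop count between coordinates a and b on a torus.

    Each dimension has wrap-around links, so the distance along a
    dimension of size d is the shorter of the two ways around the ring.
    """
    return sum(min(abs(x - y), d - abs(x - y)) for x, y, d in zip(a, b, dims))


def placement_cost(traffic, placement, dims):
    """Total communication cost: traffic volume times hop distance, summed over pairs."""
    return sum(
        volume * torus_hops(placement[p], placement[q], dims)
        for (p, q), volume in traffic.items()
    )


# Assumption: 32 nodes arranged as a 4 x 8 torus (the abstract gives only the node count).
dims = (4, 8)

# Assumption: invented message volumes between graph partitions 0, 1, and 2.
traffic = {(0, 1): 100, (1, 2): 100, (0, 2): 10}

# Heavily communicating partitions placed on neighboring nodes ...
near = {0: (0, 0), 1: (0, 1), 2: (0, 2)}
# ... versus scattered across the torus.
far = {0: (0, 0), 1: (2, 4), 2: (0, 7)}

cost_near = placement_cost(traffic, near, dims)
cost_far = placement_cost(traffic, far, dims)
print(cost_near, cost_far)
```

Under this toy cost model, placing the partitions that exchange the most data on adjacent nodes gives a much lower total cost than scattering them, which is the general intuition behind a torus-aware partition and placement scheme.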

Keywords/Search Tags:

Data Center, Accelerator, Graph Processing, Distributed Architecture, Communication Optimization


Related items

1	Optimization On Computation Costs And Communication Efficiencies Of Distributed Graph-Processing Systems
2	Research On Meteorological Data Processing And Visualization Platform Based On Distributed Architecture
3	Research On Several Distributed Graph Processing Algorithms And A Unified Graph Programming Framework
4	Research On Communication Optimization Of Distributed Graph Parallel Computing
5	Research On Energy Conservation Of Geo-Distributed Data Center
6	Optimization On Partitioning Methods And Processing Workflow Of Distributed Graph-Processing Systems
7	Research On Distance Computing In Distributed Graph Data Management System
8	Distributed Multi-task Learning Algorithms Based On Graph Signal Processing
9	Research On Graph Partitioning Algorithms In Distributed Environment
10	Dependency-aware Incremental Processing Accelerator For Directed Evolving Graph