Font Size: a A A

Research On Large-scale Vertex-cut Dynamic Graph Partitioning Algorithm

Posted on:2022-08-13Degree:MasterType:Thesis
Country:ChinaCandidate:H YuanFull Text:PDF
GTID:2480306602993049Subject:Software engineering
Abstract/Summary:PDF Full Text Request
As the amount of data grows,large-scale graph computation tasks need to run on distributed clusters.Graph partitioning is an indispensable data preprocessing step in distributed graph computation.It divides graph data into multiple parts according to a certain rule and distributes them to machines.Graph partitioning mainly affects the performance of graph computation from two aspects: 1)the partition load balance affects the time cost of graph computation because low-load parts need to wait for high-load parts to complete the task before entering the next iteration;2)the number of vertex-cut or edge-cut among parts affects the communication cost of graph computation because vertices need to communicate with remote neighbors across machines.The goal of traditional graph partitioning research is to partition a static graph,but most of the real-world graphs are constantly changing dynamically and growing in scale,which makes most static graph partitioning algorithms no longer applicable.Many graph applications need to interact with users in real-time,which requires the dynamic graph partitioning algorithm to have the following two characteristics: 1)it can process the dynamic changes in the graph in real-time;2)it can maintain a high-quality partition with lower overhead.However,the existing research on dynamic graph partitioning cannot meet the above two points at the same time.This thesis first analyzes the problem of dynamic graph partitioning theoretically and proves that the partition quality will decrease as the scale of the graph grows without optimizing.To solve this problem,we propose the concepts of local suboptimal partitioning and edge group.The local suboptimal partitioning uniformly defines the local structure that can be optimized,and the group edge is used to improve the local suboptimal partitioning.An edge group is a group of closely connected edges.Based on the difference in the degree of tightness between the edge group and each part,the migration of the edge group can transform the local suboptimal partitioning into the local optimal partitioning,thereby reducing local cutting.Based on these theories and concepts,we mainly propose a dynamic edge partitioning algorithm GR-DEP based on group reassignment,which is a complete dynamic graph partitioning solution.In GR-DEP,the dynamic changes enter partition in the form of a distributed stream.Each part deals with the dynamic changes in parallel and asynchronously searches and migrates the edge groups to improve the partition quality in real-time.The edge group search strategy is the core of GR-DEP and we propose two different strategies.The first one fully explores the local structure and can search for edge groups of different shapes and sizes.The second one uses the greedy strategy to find the edge group in a limited local structure,with low and stable time overhead.They are applied to different types of dynamic changes to complement each other so that a higher partition quality can be achieved with lower overhead.In the experiment,this thesis first analyzes GR-DEP,verifies that the algorithm can meet different partition quality and partitioning overhead requirements through parameter adjustment,and verifies the effectiveness of the hybrid edge group search strategy.Secondly,the thesis compares GR-DEP with different dynamic graph partitioning algorithms and verifies that the algorithm is robust to different initial partition quality,robust to dynamic changes of different types and scales,and is also scalable.Compared with state-of-the-art dynamic graph partitioning algorithms,GR-DEP can reduce vertex-cut by 29.5% on average.Finally,the experiment imports the partitioning results of the algorithms into the distributed graph processing system Powergraph.When the scale of the graph increases by about 40%,GRDEP can reduce the increase in communication cost and running time of graph computation tasks by 41.5% and 71.4%,respectively.
Keywords/Search Tags:Graph Partitioning, Dynamic Graph, Distributed Algorithm, Edge Group
PDF Full Text Request
Related items