Research On Distributed Processing And Load Balancing Method Of Graph Join

Posted on: 2021-06-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z C Song
GTID: 2480306107953219
Subject: Computer technology
Abstract/Summary:
In recent years, with the development of computing and artificial intelligence, the volume of graph data on the Internet and in industry has grown rapidly, and managing graph data at this scale has become an unavoidable problem. In the field of data management, the database is the most widely used approach. In distributed graph databases, the current solutions for joins are the centralized join and the MapReduce join. In the centralized join scheme, all data is fetched to a single node to execute the join, which can cause severe network congestion and an unbalanced load across the distributed nodes. In the MapReduce join scheme, the limitations of the MapReduce execution framework introduce extra computation, such as shuffling and redundant sorting.

The distributed graph join scheme completes the join of intermediate data through multi-node cooperation. It uses the concept of non-adjacent edges, combined with the statistical information of each node, to generate a suitable distributed join plan. The plan is decomposed and, based on the statistical data, sent to the appropriate nodes for execution. Each node in the distributed environment executes a part of the join plan, and together the nodes produce the final joined result. Experiments show that, compared with the centralized join, the distributed graph join scheme improves both the execution time of the global join task and the load balance across nodes while the join task is executing.

In addition, a load migration strategy estimates the execution difficulty of a join task, finds an appropriate data positioning point through sampling, selects an appropriate agent computing node, and then decomposes and migrates the task. By estimating task difficulty and selectively decomposing and migrating tasks, the strategy achieves load balancing in a high-load distributed environment, reduces the probability of hot spots, and improves the overall efficiency of distributed graph join tasks under high load. It also prevents heavy query tasks from being killed because of insufficient system resources.
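
The load migration steps described above (estimate difficulty, locate a split point by sampling, pick an agent node, decompose, migrate) can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration and is not taken from the thesis: the `Node` class, the function names, the use of estimated join output size as the "difficulty" measure, the median of a key sample as the data positioning point, and the choice of the least-loaded other node as the agent.

```python
import random
from dataclasses import dataclass


@dataclass
class Node:
    """Hypothetical compute node with an abstract load counter."""
    name: str
    load: float = 0.0


def estimate_join_size(left_keys, right_keys, sample_size, rng):
    """Estimate the join output size from a sample of the left keys.

    Assumption: output size stands in for "execution difficulty".
    """
    right_counts = {}
    for key in right_keys:
        right_counts[key] = right_counts.get(key, 0) + 1
    n = min(sample_size, len(left_keys))
    if n == 0:
        return 0.0
    sample = rng.sample(left_keys, n)
    matches = sum(right_counts.get(key, 0) for key in sample)
    # Scale the sampled match count up to the full left input.
    return matches * len(left_keys) / n


def pick_split_key(left_keys, sample_size, rng):
    """Assumed positioning rule: the median of a sample of join keys."""
    n = min(sample_size, len(left_keys))
    sample = sorted(rng.sample(left_keys, n))
    return sample[len(sample) // 2]


def plan_migration(left_keys, right_keys, nodes, owner, budget,
                   sample_size=100, seed=0):
    """Return a list of (node name, left keys) assignments for one join task.

    The task stays on `owner` if its estimated cost fits within `budget`;
    otherwise it is split in two and the upper part is migrated to the
    least-loaded other node (the "agent").
    """
    rng = random.Random(seed)
    cost = estimate_join_size(left_keys, right_keys, sample_size, rng)
    others = [node for node in nodes if node is not owner]
    if owner.load + cost <= budget or not others:
        owner.load += cost
        return [(owner.name, left_keys)]

    split = pick_split_key(left_keys, sample_size, rng)
    keep = [key for key in left_keys if key < split]
    move = [key for key in left_keys if key >= split]
    if not keep or not move:
        # The sample could not separate the keys; run the task in place.
        owner.load += cost
        return [(owner.name, left_keys)]

    agent = min(others, key=lambda node: node.load)
    owner.load += estimate_join_size(keep, right_keys, sample_size, rng)
    agent.load += estimate_join_size(move, right_keys, sample_size, rng)
    return [(owner.name, keep), (agent.name, move)]
```

Calling `plan_migration` with an owner node that is already near its budget returns two assignments, one for the owner and one for the agent; with a generous budget it returns a single assignment and the task is not decomposed. A real system would also have to account for data transfer cost and for skew on the right-hand input, which this sketch ignores.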
Keywords/Search Tags: distributed graph database, distributed join plan, load migration, load balance