| With the rapid growth of interconnected data, data sets not only grow quickly in size but also become more numerous and complex in the connections between entities. Compared with data structures such as tables, graph data structures are better suited to representing these connections. Large-scale graph computing has two main requirements: the ability to accommodate and process large-scale data, and high-density computing power to achieve low response latency. Correspondingly, large graph analysis systems are mainly built along two routes: distributed CPU-based graph analysis systems built on different graph computation models, or single-node graph analysis systems built on high-performance computing hardware (including GPUs, FPGAs, etc.). However, each approach meets only some, not all, of the requirements posed by modern big graph analysis scenarios. It is therefore necessary to design a big graph analysis system that supports both kinds of scaling. In terms of computational models, graph analysis systems and their upper-layer graph analysis services correspond to iterative graph computation and aggregated graph computation, respectively. Iterative graph computation is usually based on the graph topology and analyzes the data over repeated iterations. Aggregated graph computation produces small, easy-to-understand summaries of large-scale graph data sets and provides a condensed representation of the data features specified by the application scenario. Both iterative and aggregated graph computation face challenges arising from large data sizes and high-density computing power requirements. Therefore, we want to develop high-performance graph analytics middleware that provides both horizontal scaling across clusters and vertical scaling in computing power for iterative and aggregated graph computing alike. In view of the above
requirements, this paper designs a high-performance graph analysis middleware that supports both horizontal and vertical scaling for graph analysis systems. The middleware can connect to distributed graph analysis systems and high-performance computing hardware, supports both iterative and aggregated graph computation, and serves as the basis for a series of system optimizations. Our main research falls into three parts: 1. Generic graph computing framework design. The main goal of this part is to build a middleware framework that supports both horizontal scaling in cluster size and vertical scaling in computing power. It consists of the following components: the Daemon-Agent design ensures the scalability of the middleware at both the horizontal and vertical levels at its core; the shared memory-based data transfer process and the flexible implementation and access APIs provide efficient processing and diversity support for graph algorithms at the data and program levels. The middleware also provides a series of targeted optimizations for different deployment environments so that systems deployed on it execute efficiently. 2. Middleware optimization for iterative graph computation. The main goal of this part is to dissect the execution flow of the system layer by layer, identify the problems that may lead to poor performance, and optimize them. These optimizations are organized at three levels: within each computation iteration, to address the low utilization of computing hardware caused by excessive data flow steps, we design a pipeline reorganization mechanism that improves hardware utilization and overall system performance while keeping data exchange efficient; between computation iterations, to
address the additional overhead caused by redundant data synchronization, we design a synchronization caching mechanism and a synchronization skipping mechanism that reduce or even eliminate the performance impact of data synchronization, depending on the situation; for the overall framework outside the computation process, to ensure that the system achieves load balancing, we design a set of cost models to evaluate system overhead and, on this basis, propose two load balancing schemes with different modes to help the system achieve better overall performance. 3. Support for applications related to aggregated graph computing. To enable the system to handle more complex real-world application scenarios, we introduce a computational model for aggregated graph computation, together with performance optimizations, into the middleware framework. These optimizations are organized at three levels: at the thread level, we implement lock-free aggregation through a custom BSP model, eliminating the overhead of locks; at the thread block level, we implement data distribution-aware flow graph partitioning to distribute aggregated data uniformly among partitions; at the overall system level, we build a cost model to evaluate system performance and guide the configuration of thread block sizes, helping the system achieve better execution performance. In summary, in this paper we present GX-Plug, a middleware for building customized graph analytics systems that support both iterative and aggregated graph computation and scale flexibly in cluster size and computational power. On top of this, we propose several performance optimizations for graph analytics systems that sustain efficient performance under both horizontal and vertical scaling. |
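
The idea of lock-free aggregation under a BSP (bulk synchronous parallel) model can be illustrated with a minimal sketch. It is not GX-Plug's implementation: the function names `local_aggregate` and `bsp_aggregate`, the edge-list partitions, and the use of CPU threads in place of accelerator threads are all assumptions made for illustration. Each worker aggregates into its own private buffer during the compute phase, and the partial results are merged only after every worker has finished, so no shared structure is written concurrently and no lock is required.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_aggregate(edges):
    """Compute phase: count out-edges per source vertex in a private buffer."""
    partial = Counter()  # owned by exactly one worker, never shared
    for src, _dst in edges:
        partial[src] += 1
    return partial


def bsp_aggregate(partitions):
    """One superstep: parallel local aggregation, then a single-threaded merge."""
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        # list(pool.map(...)) returns only after every worker has finished,
        # which plays the role of the superstep barrier in this sketch.
        partials = list(pool.map(local_aggregate, partitions))

    # Merge phase: runs in one thread after the barrier, so no lock is needed.
    merged = Counter()
    for partial in partials:
        merged.update(partial)
    return merged


# Hypothetical input: two edge-list partitions of a small directed graph.
partitions = [
    [(0, 1), (0, 2), (1, 2)],
    [(2, 0), (0, 3)],
]
out_degree = bsp_aggregate(partitions)
```

Here `out_degree` maps each source vertex to its out-degree across both partitions. The trade-off in this pattern is extra memory for the per-worker buffers in exchange for avoiding contention on a shared aggregate.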