| As key infrastructure supporting cloud computing services, data centers provide a strong guarantee for data storage and processing. In recent years, with the rapid development of artificial intelligence and big data technology, the volume of data to be processed has grown exponentially, and traditional data centers can no longer meet large-scale, diverse computing needs. Geographically distributed data centers can coordinate the computing power of multiple data centers to meet these needs. However, they face more complex data management and task scheduling scenarios because data files are dispersed, data transmission is costly, and network resources are limited. To address these issues and optimize data center performance, this paper studies task placement and data transmission optimization for geographically distributed data centers. It establishes a relationship model among tasks, data centers, and data files using hypergraphs, proposes a joint scheduling method based on hypergraph partitioning, and further improves scheduling performance by improving an existing graph neural network. The specific research work is as follows: (1) For the scenario of geographically distributed data centers, this paper establishes a relationship model among tasks, data centers, and data files, and then proposes a joint scheduling method for task placement and data migration based on hypergraph partitioning. The method has two stages. First, because hypergraphs perform well in modeling complex problems, the relationship among tasks, data centers, and data files is modeled as a hypergraph and the optimization problem is formulated; a hypergraph-based partitioning method is then designed and developed for first-stage task placement. In addition, taking into account each task's dependence on data, a task reassignment scheme and a data-dependency-aware transmission scheme are designed to further optimize the hypergraph partitioning results, minimizing task completion time and reducing the amount of data transmitted. (2) Building on the previous work, the strength of graph neural networks in capturing structural features is exploited for further performance optimization: by improving the feature learning method of the graph neural network, an improved graph neural network scheduling method is proposed that aims to minimize task completion time. First, the hypergraph model is combined with the graph neural network model, and the network's convolution method is improved. An attention mechanism is also introduced to strengthen the network's learning ability so that it can partition tasks efficiently and accurately. The convolutional layers repeatedly learn node information, so the method can be applied effectively to a variety of large-scale task types, and the task placement results are finally output through node classification to minimize task completion time. (3) The experiments mainly use the data set of the China-VO project and compare the proposed methods with other classic algorithms to verify their effectiveness. Experimental results show that both methods effectively solve the task placement and data migration problem in geographically distributed data centers and improve their performance. The joint scheduling method, which mainly relies on hypergraph partitioning, reduces both the amount of data transmitted between data centers and the task completion time, while the improved graph neural network scheduling method uses the strong learning ability of the graph neural network to further improve scheduling efficiency for large-scale, multi-type tasks. |
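
The hypergraph relationship model described in the abstract can be illustrated with a small sketch. This is not the paper's partitioning algorithm: it assumes a toy instance in which each data file is a hyperedge joining the tasks that read it, and it swaps in a simple greedy rule (place each task where most of its input bytes already reside) in place of hypergraph partitioning. All names, sizes, and the cost rule (a file is shipped once to each remote data center that needs it) are hypothetical assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy instance; every name and size here is an assumption.
file_size = {"f1": 10, "f2": 4, "f3": 6}                     # data file -> size
file_location = {"f1": "dc_a", "f2": "dc_b", "f3": "dc_b"}   # data file -> home data center
task_inputs = {
    "t1": {"f1", "f2"},
    "t2": {"f2", "f3"},
    "t3": {"f1"},
}


def build_hyperedges(task_inputs):
    """Build one hyperedge per data file, joining every task that reads it."""
    edges = defaultdict(set)
    for task, files in task_inputs.items():
        for f in files:
            edges[f].add(task)
    return dict(edges)


def greedy_placement(task_inputs, file_location, file_size):
    """Place each task at the data center holding most of its input bytes.

    A stand-in heuristic, not the hypergraph partitioning method of the paper.
    """
    placement = {}
    for task in sorted(task_inputs):
        local_bytes = defaultdict(int)
        for f in task_inputs[task]:
            local_bytes[file_location[f]] += file_size[f]
        # Iterating in sorted order makes tie-breaking deterministic.
        placement[task] = max(sorted(local_bytes), key=lambda dc: local_bytes[dc])
    return placement


def transfer_volume(placement, hyperedges, file_location, file_size):
    """Sum file sizes over every remote data center that needs each file."""
    total = 0
    for f, tasks in hyperedges.items():
        remote = {placement[t] for t in tasks} - {file_location[f]}
        total += file_size[f] * len(remote)
    return total


hyperedges = build_hyperedges(task_inputs)
placement = greedy_placement(task_inputs, file_location, file_size)
volume = transfer_volume(placement, hyperedges, file_location, file_size)
print(placement, volume)
```

In this toy instance only `f2` is read by tasks placed in two different data centers, so it is the only file that has to cross the network. A partitioning-based method would aim to choose the task groups so that as few such hyperedges as possible are cut.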