| Hybrid cloud service providers (e.g., Huawei Cloud and Google Cloud) are integrating the computing power of cloud data centers (cloud), edge nodes (edge), and end devices (end). Their aim is to build integrated cloud-edge-end orchestrated systems with an overall scheduling scheme in which tasks at end devices can be offloaded to the edge and the cloud, while edge-cloud clusters handle these tasks. However, as cloud-edge-end orchestrated systems grow in scale, structural complexity, and resource dynamics, the challenge of “how to meet online scheduling requirements without relying entirely on accurate system modeling” is becoming increasingly prominent: first, it is difficult to improve the utility of offload scheduling online; second, it is difficult to coordinate service resource competition online; and third, it is difficult to integrate global load scheduling online. Therefore, in this paper, we design an online scheduling solution for hybrid cloud service providers that addresses the above challenges with the goal of optimizing service experience and system throughput. The specific research content and main contributions are as follows. To address the challenge that online offloading scheduling utility is difficult to improve in “the collaboration of end devices and the edge”: this paper proposes a solution based on deep reinforcement learning to generate task offloading and resource allocation decisions online, and designs an “In-Edge AI” (“In-Edge Artificial Intelligence”) framework that integrates federated learning and deep reinforcement learning, enabling end devices to learn adaptive task offloading and scheduling strategies in dynamic scenarios with continuous fluctuations in device endurance, network conditions, and task loads. This paper evaluates and validates, both theoretically and experimentally, the convergence and effectiveness of this offload scheduling strategy, and confirms that its offload scheduling performance can approach that of centralized deep 
reinforcement learning. To address the challenge that service resource competition is difficult to coordinate online in “the collaboration of the edge and the cloud”: first, this paper proposes an online scheduling architecture for delay-sensitive services and formulates a scheduling problem for service experience optimization; second, based on potential game theory, this paper investigates the finite improvement property of the service resource competition game under this architecture, proves that the game can reach a Nash equilibrium, and designs a low-complexity, decentralized online scheduling optimization algorithm for service entities based on this property; third, the efficiency ratio of the obtained solution to the optimal solution is analyzed in terms of the Price of Anarchy, and the effectiveness of the designed algorithm and its scalability with respect to the number of users are experimentally verified. To address the challenge that online scheduling for global loads is difficult to integrate in “the collaboration of the edge and the cloud”: considering the complexity of modeling and scheduling edge-cloud clusters in the cloud-edge-end orchestrated system, this paper designs an online scheduling framework for such edge-cloud clusters that can build on learning algorithms to improve the long-term throughput of processing best-effort batch tasks. On the one hand, a multi-agent coordinated actor-critic algorithm is designed to support decentralized request dispatch at the edge and to tolerate dynamic scheduling action spaces; on the other hand, graph neural networks are used to embed system state information for different system scales and structures, and the embedding results are combined with multiple policy networks to reduce the service orchestration dimension through stepwise scheduling. Finally, based on the above series of algorithms, this paper proposes a two-time-scale online scheduling mechanism that coordinates request dispatch and service orchestration and 
can achieve compatibility with native Kubernetes components. In summary, this paper provides a technical solution for optimizing user service experience and platform throughput for hybrid cloud service providers that aim to orchestrate the computing power of cloud data centers, edge nodes (clusters), and end devices. The proposed scheduling solutions comprehensively consider the timeliness, effectiveness, and scalability of scheduling, and can be deployed in cloud-edge-end orchestrated systems to perform online scheduling tasks. |