The rise of container technology and the spread of Cloud Native concepts have reshaped cloud computing infrastructure. As the de facto standard for container orchestration in the Cloud Native era, Kubernetes is widely used to deploy and manage large-scale container clusters. It supports the rapid construction, deployment, and operation of enterprise applications by exploiting the cloud computing delivery model across heterogeneous cloud platforms. However, Kubernetes lacks a strong multi-tenant model and a hierarchical resource model. It supports only a flat, logical division of cluster resources by Namespace, and therefore cannot provide hierarchical management and control of fine-grained resources in multi-tenant scenarios. In addition, the Kubernetes scheduling process is not monitored, which makes it difficult for developers to observe cluster scheduling in real time and to locate performance bottlenecks. Finally, Kubernetes supports only sequential scheduling from a single queue and has no fair scheduling policy, which leads to unfair resource allocation among users and longer container scheduling delays. To address these problems, this paper studies a Kubernetes-based scheduling system for complex workloads. The specific research contents are as follows. First, because Kubernetes has no hierarchical resource model and therefore cannot manage resources hierarchically in multi-tenant scenarios, this paper proposes two resource objects built on the Kubernetes API: Queue and QueueBinding. Queue maps onto the tree-shaped organizational structure of an enterprise and supports the hierarchical resource management requirements of various workloads in multi-tenant scenarios. QueueBinding associates abstract resource objects with system resources, which reduces system coupling. This paper also extends the resource management functions of Queue at the scheduling level, including resource sharing and exclusive resource use within the hierarchy, to further improve the availability of the resource model and the observability of the scheduling system. Second, because the lack of hierarchical resources and performance monitoring in Kubernetes makes it difficult to observe the cluster scheduling process in real time and to locate performance bottlenecks, this paper designs scheduling metrics and Queue-level resource metrics based on time series data to describe the performance of scheduling plugins, and uses Prometheus and Grafana to provide dynamic collection, monitoring alerts, and visualization of these custom metrics. Third, to address unfair scheduling in Kubernetes under multi-tenant scenarios, this paper studies and compares multi-resource fair scheduling algorithms and adopts the H-DRF algorithm, which evaluates pods based on factors such as the weight of the Queue in the hierarchical resource model and the proportion of each resource type consumed. Finally, based on the above model and algorithm research, this paper designs and implements a Kubernetes-based scheduling system for complex workloads. Experimental results show that the scheduling system developed in this paper compensates for the shortcomings of Kubernetes in hierarchical resource management and fair scheduling, and shows good scheduling performance.
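To make the idea of fair scheduling over a hierarchy of weighted Queues more concrete, the following is a minimal Python sketch of a weighted dominant-share selection over a queue tree. It is not the paper's implementation and it is a simplification of H-DRF: it omits the additional steps the full algorithm uses, and it ignores QueueBinding, resource sharing, and exclusive resources. The `Queue` class, its fields, and the functions `total_usage`, `dominant_share`, and `pick_queue` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Queue:
    """Hypothetical node in a hierarchical resource model (not the paper's API)."""
    name: str
    weight: float = 1.0
    usage: dict = field(default_factory=dict)      # resources used directly by this queue
    children: list = field(default_factory=list)   # child queues, empty for a leaf


def total_usage(queue):
    """Sum resource usage over the queue and all of its descendants."""
    totals = dict(queue.usage)
    for child in queue.children:
        for resource, amount in total_usage(child).items():
            totals[resource] = totals.get(resource, 0.0) + amount
    return totals


def dominant_share(queue, capacity):
    """Largest fraction of any cluster resource the subtree uses, divided by its weight."""
    used = total_usage(queue)
    share = max((used.get(r, 0.0) / capacity[r] for r in capacity), default=0.0)
    return share / queue.weight


def pick_queue(root, capacity):
    """Descend from the root, at each level choosing the child with the lowest
    weighted dominant share, and return the leaf queue to schedule from next."""
    node = root
    while node.children:
        node = min(node.children, key=lambda q: dominant_share(q, capacity))
    return node


if __name__ == "__main__":
    capacity = {"cpu": 10.0, "memory": 20.0}
    team_a = Queue("team-a", weight=1.0, children=[
        Queue("a1", usage={"cpu": 4.0, "memory": 2.0}),
        Queue("a2", usage={"cpu": 1.0, "memory": 1.0}),
    ])
    team_b = Queue("team-b", weight=2.0, children=[
        Queue("b1", usage={"cpu": 2.0, "memory": 8.0}),
    ])
    root = Queue("root", children=[team_a, team_b])
    print(pick_queue(root, capacity).name)
```

In this example, `team-a` uses half of the cluster CPU, while `team-b` uses 40% of the memory but carries twice the weight, so the next pod would be taken from a queue under `team-b`. Choosing by dominant share rather than by a single resource is what lets CPU-heavy and memory-heavy tenants be compared on one scale.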