Workload Characteristic Analysis And Container Performance Prediction Of Colocated Clusters

Posted on: 2020-08-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z J Cui
GTID: 2428330605967981
Subject: Computer technology
Abstract/Summary:
Cloud platforms have become a basic platform for providing computing services. The resources that online tasks need while running on a cluster vary greatly. To guarantee the quality of service of online tasks, clusters allocate to them the resources they need at peak load, so the overall resource utilization of the cluster stays low. To improve utilization, more and more cloud service providers deploy offline tasks alongside online tasks, forming colocated clusters. Offline tasks on the cloud platform inevitably compete with online tasks for cluster resources, and current isolation mechanisms cannot prevent other tasks from seizing the resources of a running task, which degrades the quality of service of online tasks. To address these problems, this thesis starts from the workload characteristics of a colocated cluster, analyzes how the workload of mixed tasks changes, then proposes an algorithm to predict the performance of online tasks on the cluster, and finally proposes a performance classification model for online tasks in a colocated cluster. The main contributions are divided into the following three parts:

(1) Because cloud platforms host many types of tasks, in large numbers and in mixed deployment, this thesis takes the log data set of an Alibaba colocated cluster as its research object and analyzes the workload changes of tasks in the cluster. The analysis shows that the CPU utilization of online tasks fluctuates with a one-day period, while their memory utilization remains stable at a high level, indicating that memory has become the resource bottleneck of the cluster. The CPU and memory utilization of offline tasks also fluctuates with a one-day period, but their peaks and valleys occur at different times from those of online tasks. The peak and valley times of CPU utilization are 6:00 and 23:00 for offline tasks, and 11:00 and 5:00 for online tasks.

(2) Because the performance of containers in a cluster is disturbed by many factors, this thesis proposes using an LGB (Light Gradient Boosting) model to predict container performance. First, the factors that affect container performance are divided into static and dynamic factors, and into single-container and multi-container factors; the LGB model is then used to predict container performance. The model is validated on the running-log data set of physical machines that run online tasks. The experimental results show that dynamic factors and single-container factors have the greatest influence on container performance; that is, the container's dynamic resource usage and the running characteristics of its task affect performance more than the other factors do. The mean squared error of the LGB model's predictions is 0.009, which is better than that of the comparison models: LR (Linear Regression), Lasso (Least Absolute Shrinkage and Selection Operator), and the CART (Classification And Regression Tree) decision tree. The results indicate that the proposed algorithm has higher accuracy and better stability.

(3) To address the interference of offline tasks with container performance in colocated clusters, this thesis recasts that interference as a classification problem and judges container performance under interference by predicting the class label. First, the factors that affect container performance in the colocated cluster are extracted; then the PCA-GNB (Principal Component Analysis and Gaussian Naive Bayes) model proposed in this thesis is used to predict container performance on the colocated cluster. Finally, the PCA-GNB model is validated on the log data set of the colocated cluster. The experimental results show that its prediction accuracy is 97.95%, with higher accuracy and better model stability than the logistic regression and support vector machine models.

These results provide a reference for studying the interference of offline tasks with online tasks in colocated clusters, and are useful for improving the efficiency of resource management in colocated clusters.
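The peak and valley hours reported in part (1) are the kind of result that comes from averaging utilization by hour of day. The following is a minimal sketch of that idea, not the thesis's actual analysis code. The input format (pairs of seconds-since-midnight and utilization), the function names, and the synthetic utilization values are all assumptions made for illustration; only the peak and valley hours are taken from the abstract.

```python
from collections import defaultdict
from statistics import mean


def hourly_profile(samples):
    """Return {hour_of_day: mean utilization} for (seconds, utilization) pairs."""
    by_hour = defaultdict(list)
    for seconds, utilization in samples:
        by_hour[(seconds // 3600) % 24].append(utilization)
    return {hour: mean(values) for hour, values in by_hour.items()}


def peak_and_valley(profile):
    """Return the hours with the highest and lowest mean utilization."""
    return max(profile, key=profile.get), min(profile, key=profile.get)


def synthetic_trace(peak_hour, valley_hour, days=2):
    """Made-up hourly samples: flat 40% with one high and one low hour per day."""
    level = {peak_hour: 80.0, valley_hour: 10.0}
    return [(h * 3600, level.get(h % 24, 40.0)) for h in range(24 * days)]


# Synthetic traces shaped to mirror the hours reported in the abstract.
online = hourly_profile(synthetic_trace(peak_hour=11, valley_hour=5))
offline = hourly_profile(synthetic_trace(peak_hour=6, valley_hour=23))
print(peak_and_valley(online))   # (11, 5)
print(peak_and_valley(offline))  # (6, 23)
```

Folding every sample into its hour of day before averaging is what exposes a one-day period: peaks that recur at the same hour reinforce each other, while irregular spikes are averaged out.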
Keywords/Search Tags:colocated clusters, container, performance prediction, workload characteristics analysis, resource management, machine learning