A Study Of Trace-driven Resource Scheduling Simulator In Large-scale Cluster

Posted on:2024-02-05

Degree:Master

Type:Thesis

Country:China

Candidate:T Ling

GTID:2558307103975189

Subject:Computer technology

Abstract/Summary:

Resource scheduling in a cloud cluster is the process of allocating cluster resources to application instances. It mainly involves optimization techniques such as initial scheduling, rescheduling, parallel scheduling, and mixed scheduling. Scheduling has a significant impact on business performance, reliability, and resource utilization. The effectiveness of a scheduling approach must be verified experimentally, but experiments on online clusters can easily crash online services and are not repeatable. Simulating the resource scheduling process of a cloud cluster is therefore of practical importance. To address this problem, this thesis designs and implements a large-scale cloud cluster resource scheduling simulation system. The system is driven by cluster trace data and can restore and replay an online production cluster. It lets users plug in configurable scheduling and rescheduling algorithms and observe, in a repeatable manner, the effect of those algorithms on the cluster. The main work of this thesis includes:

(1) The architecture and implementation of Lothar, a large-scale cluster resource scheduling simulation system. Lothar runs on traces of production clusters and consists of an event generation module, a scheduling module, a rescheduling module, a core management module, and a performance evaluation and visualization module. It can restore and replay the continuous operation of a cluster, simulating the arrival of resource requests and the life cycles of instances and physical machines at runtime. By supplying a scheduling algorithm and a rescheduling algorithm, users can obtain the effect of those algorithms on the corresponding cluster.

(2) Optimization of the simulation system. The system is event-driven and relies on component registration and polling to speed up simulation and to ensure that events occur in sequence. A time-difference simulation scheme is implemented, which lowers the simulation cost and improves accuracy. A two-stage algorithm interface is provided for container placement and container migration, and it is designed as a highly scalable distributed architecture. The accuracy of migration simulation is improved by modeling migration delay, resource limits, and related factors, which reproduces the conflicts and failures between the scheduling and migration processes that are common in production clusters. Real-time visualization of the cluster status and key performance indicators is also provided.

(3) Large-scale cluster scheduling experiments. In the first experiment, on scheduling, a Google production cluster was restored and replayed, and the two-level scheduling model (kube-scheduler and DCM) was compared with Borg scheduling. In the rescheduling experiment, a production cluster of Ant Group was restored and replayed, and the rescheduling effects of the Dot-Product algorithm and the DCM algorithm were compared and analyzed.
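The event-driven replay described in (2) can be illustrated with a minimal sketch. This is not Lothar's implementation: the `Event` and `Simulator` classes, the `first_fit` policy, the single CPU dimension, and the tiny hand-written trace are all hypothetical stand-ins chosen for illustration. The sketch only shows the general idea of replaying trace events in timestamp order through a pluggable placement function.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Event:
    # Events are ordered by (time, seq); seq breaks ties in insertion order.
    time: int
    seq: int
    kind: str = field(compare=False)       # "create" or "delete" (hypothetical kinds)
    instance: str = field(compare=False)
    cpu: float = field(compare=False, default=0.0)


class Simulator:
    """Replays trace events in time order against a pluggable placement policy."""

    def __init__(self, machines, place):
        self.free = dict(machines)  # machine name -> free CPU
        self.place = place          # user-supplied scheduling algorithm
        self.where = {}             # instance -> (machine, cpu)
        self.queue = []             # min-heap of pending events
        self._seq = count()
        self.log = []               # (time, instance, outcome)

    def push(self, time, kind, instance, cpu=0.0):
        heapq.heappush(self.queue, Event(time, next(self._seq), kind, instance, cpu))

    def run(self):
        while self.queue:
            ev = heapq.heappop(self.queue)  # always the earliest pending event
            if ev.kind == "create":
                machine = self.place(self.free, ev.cpu)
                if machine is None:
                    self.log.append((ev.time, ev.instance, "failed"))
                    continue
                self.free[machine] -= ev.cpu
                self.where[ev.instance] = (machine, ev.cpu)
                self.log.append((ev.time, ev.instance, machine))
            elif ev.kind == "delete" and ev.instance in self.where:
                machine, cpu = self.where.pop(ev.instance)
                self.free[machine] += cpu
                self.log.append((ev.time, ev.instance, "deleted"))


def first_fit(free, cpu):
    """Assumed example policy: first machine (by name) with enough free CPU."""
    for machine in sorted(free):
        if free[machine] >= cpu:
            return machine
    return None


# A tiny hand-written trace, pushed out of order on purpose.
sim = Simulator({"m1": 4.0, "m2": 4.0}, first_fit)
sim.push(5, "delete", "a")
sim.push(0, "create", "a", 3.0)
sim.push(1, "create", "b", 2.0)
sim.push(6, "create", "c", 3.0)
sim.run()
for entry in sim.log:
    print(entry)
```

Because the events sit in a priority queue keyed on timestamp, they are processed in sequence regardless of the order in which they were generated, and swapping `first_fit` for another function changes the scheduling algorithm without touching the replay loop.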

Keywords/Search Tags:

cloud computing, simulation system, resource scheduling, performance evaluation

Related items

1	Research On Dynamic Resource Scheduling Approach For SBS Application In Cloud Environment
2	Research On Resource Scheduling Algorithms In Cloud Computing
3	Research On GPU Parallel Computing And Application For HPC Cloud
4	Design And Implementation Of Multidimensional Resource Control Subsystem In Cloud Computing Platform
5	Scheduling Strategy Study And Performance Analysis In Cloud Computing
6	Multi-resource Power Aware Scheduling Oriented Simulation Tool, Algorithm And Research For Cloud Computing
7	Research On Key Technologies Of Resource Scheduling In Cloud Computing System
8	Research On Resource Scheduling Algorithms Based On The Multiobjective Optimization In Cloud Computing
9	Research On Virtual Resource Scheduling Strategy In Cloud Computing Environment
10	The Scheduling Methods Based On The Task Features And Resource Constraints In Cloud Computing