
Research On Real Time ETL Elastic Scheduling Mechanism

Posted on: 2021-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: X L Liu
GTID: 2370330605453518
Subject: Software engineering
Abstract/Summary:
In the current business environment, people prefer to use the latest data to guide business analysis, and real-time ETL (Extract-Transform-Load) has been proposed to meet this need. Although real-time ETL has been studied extensively, existing solutions do not consider the dynamic nature of the data or the elasticity it calls for. In many real scenarios, the data production speed of a data source fluctuates over time, and the fluctuation range is wide. Moreover, in a real-time ETL system each ETL process is a resident application. If the resources used by an ETL process at runtime can be scheduled elastically, the resource utilization of the platform can be improved. This paper therefore studies an elastic scheduling mechanism for real-time ETL systems under large data fluctuations.

The main work of this paper is divided into two parts: the elastic scheduling of multiple ETL processes for a single user, and the elastic scheduling of a single ETL process. For the elastic scheduling of multiple ETL processes for a single user, this paper first designs a time series prediction model to predict the future data production speed of each ETL process. Then, based on the resource list submitted by each ETL process, an urgency-based dynamic scheduling algorithm is proposed to schedule resources among the user's ETL processes. For the elastic scheduling of a single ETL process, this paper first proposes a greedy load balancing algorithm that uses the load of each server to keep the load balanced. Then, an improved algorithm based on Dinic is proposed to solve the stable matching problem between ETL services.

Experimental results show that, compared with a traditional scheduling mechanism, the proposed elastic scheduling improves resource usage without affecting the data consumption speed. This paper also verifies that the proposed solutions achieve better performance on their corresponding problems.
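
The abstract does not give the details of the greedy load balancing algorithm, so the following is only a minimal sketch of one common greedy strategy (largest task first, always placed on the currently least-loaded server). It is not the thesis's algorithm. The function name `greedy_balance`, the task and server names, and the load values are all hypothetical and chosen for illustration.

```python
import heapq


def greedy_balance(task_loads, server_loads):
    """Greedy sketch: place each task on the currently least-loaded server.

    task_loads:   dict mapping a (hypothetical) ETL task name to its load.
    server_loads: dict mapping a (hypothetical) server name to its current load.
    Returns (assignment, final_loads).
    """
    # Min-heap of (current load, server name); ties break on server name.
    heap = [(load, name) for name, load in server_loads.items()]
    heapq.heapify(heap)

    assignment = {}
    # Largest tasks first; ties break on task name so the result is deterministic.
    for task, load in sorted(task_loads.items(), key=lambda kv: (-kv[1], kv[0])):
        current, name = heapq.heappop(heap)
        assignment[task] = name
        heapq.heappush(heap, (current + load, name))

    final_loads = {name: load for load, name in heap}
    return assignment, final_loads


# Illustrative input only; these numbers do not come from the thesis.
assignment, final_loads = greedy_balance(
    {"extract": 5, "transform": 3, "load": 2, "clean": 4},
    {"s1": 0, "s2": 1},
)
```

Sorting tasks in descending order before placement is a design choice of this sketch: it tends to leave the small tasks for evening out the remaining imbalance, at the cost of an extra sort.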
Keywords/Search Tags:Real Time ETL, Elastic Scheduling, Resource Scheduling, Stable Matching