Font Size: a A A

Research For Some Parallel Scheduling Problems In MapReduce

Posted on:2020-03-04Degree:MasterType:Thesis
Country:ChinaCandidate:C ZhengFull Text:PDF
GTID:2370330572461828Subject:Mathematics
Abstract/Summary:PDF Full Text Request
MapReduce is a popular batch data processing framework introduced by Google,which is widely used in industry and academia for processing large-scale data.This thesis mainly considers scheduling on parallel machines in MapReduce systerm,including optimal solution algorithms on uniform machines and lower bounds and online algorithms for the(semi-)online scheduling on identical machines.In this thesis,we assume that the map tasks are fractional and the reduce tasks are preemptive.Our goal is to minimize the maximum completion time of all the jobs,i.e.,makespan.The thesis consists of five chapters.In Chapter 1,we mainly introduce the related concepts and basic knowledge of scheduling problem,and the background and previous results on MapReduce scheduling.In Chapter 2,we consider the MapReduce scheduling on uniform machines.We provide an optimal solution algorithm for the problem on three uniform machines is given by the analysis on the types of the instance.In Chapter 3,we consider the lower bound of online MapReduce scheduling on m identical machines.We show that the competitive ratio of any online algorithm for the problem is at least 1.7135.In Chapter 4,we study semi-online scheduling on two identical machines with known the total size of all the jobs.We show that the competitive ratio of any online algorithm is at least 4/3 no matter whether the reduce tasks are preemptive or not.We further provide an optimal online algorithm with a competitive ratio of 4/3.In Chapter 5,we summarize our results and give some remarking conclusions for the future study.
Keywords/Search Tags:MapReduce, parallel machine Scheduling, makespan, algorithm, lower bound
PDF Full Text Request
Related items