
The Research and Optimization of a New Hadoop Job Scheduling Algorithm Based on MLFQ

Posted on: 2015-03-04
Degree: Master
Type: Thesis
Country: China
Candidate: Z Z Qi
Full Text: PDF
GTID: 2308330482957275
Subject: Computer software and theory
Abstract/Summary:
Cloud computing, put forward by companies such as Google and Amazon, has attracted wide attention in industry and academia, and the technology has developed enormously in recent years through their combined efforts. Many widely used cloud computing systems are built on the Hadoop platform. Hadoop is an open-source cloud computing framework whose biggest advantage is that it makes parallelism transparent to developers: programmers can develop cloud computing applications without caring about the details of parallel execution, which the Hadoop framework handles at the bottom layer. Job scheduling is one of the core components of the Hadoop platform. Its main function is to control and allocate the computing resources of the cluster, so it directly affects the overall performance of the platform and the utilization of system resources.

The thesis is based on the Hadoop 1.2.1 platform. First, the Hadoop job scheduling mechanism is studied. Then, a new Hadoop job scheduling algorithm is proposed. The main contributions of the thesis are as follows:
1. A new job scheduling algorithm based on the multi-level feedback queue (MLFQ), called the MLFQ Job Scheduling Algorithm, is proposed. MLFQ was first applied in operating systems, where it achieved good results. An improved version of the algorithm is used to schedule jobs on the Hadoop platform; it solves the problem that small jobs cannot get fair scheduling and enhances the overall performance of the platform.
2. A Map Task delay scheduling algorithm is proposed to solve the problem of poor data locality. A Reduce Task delay scheduling algorithm is used to solve the "Reduce Slot Hoarding Problem" that arises when jobs are scheduled strictly according to the MLFQ job scheduling algorithm.
3. At the end of the thesis, a four-node Hadoop platform is constructed, and the performance of the MLFQ job scheduling algorithm, the Map Task delay scheduling algorithm, and the Reduce Task delay scheduling algorithm is tested on it.

The results show that the MLFQ job scheduling algorithm shortens the average response time of small jobs, and that, on this basis, the Map Task delay scheduling algorithm and the Reduce Task delay scheduling algorithm improve the throughput of the system.
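
The abstract does not give the details of the MLFQ Job Scheduling Algorithm, so the following is only a minimal sketch of the classic multi-level feedback queue idea it builds on, not the thesis's implementation. The `Job` class, the `mlfq_schedule` function, the abstract "work units", and the quanta `(2, 4, 8)` are all assumptions made for illustration. Every job starts in the highest-priority queue; a job that uses up its quantum without finishing is demoted one level, so short jobs complete before long ones monopolize the cluster.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    remaining: int  # remaining work in abstract units (hypothetical measure)


def mlfq_schedule(jobs, quanta=(2, 4, 8)):
    """Simulate a multi-level feedback queue; return {job name: finish time}.

    quanta[i] is the time slice granted at priority level i (0 is highest).
    All jobs are assumed to arrive at time 0 (a simplification).
    """
    queues = [deque() for _ in quanta]
    for job in jobs:
        # Copy each job so the caller's objects are not modified;
        # new jobs always enter the highest-priority queue.
        queues[0].append(Job(job.name, job.remaining))

    clock = 0
    finished = {}
    while any(queues):
        # Serve the highest-priority non-empty queue.
        level = next(i for i, q in enumerate(queues) if q)
        job = queues[level].popleft()
        run = min(quanta[level], job.remaining)
        clock += run
        job.remaining -= run
        if job.remaining == 0:
            finished[job.name] = clock
        else:
            # Quantum exhausted: demote one level. Jobs already at the
            # lowest level are re-queued there (round-robin).
            queues[min(level + 1, len(queues) - 1)].append(job)
    return finished


if __name__ == "__main__":
    print(mlfq_schedule([Job("big", 20), Job("small", 2)]))
```

With a 20-unit job submitted ahead of a 2-unit job, the small job finishes at time 4 in this sketch, whereas strict first-in-first-out ordering would make it wait until time 22.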
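
The Map Task delay scheduling algorithm is likewise not specified in the abstract. The sketch below shows the general delay scheduling technique for data locality, under the assumption that the thesis follows the same broad idea: when a node offers a free map slot, a job with no input data on that node is skipped a bounded number of times before it is allowed to run a non-local task. `MapJob`, `assign_map_task`, and `max_skips` are hypothetical names, not Hadoop APIs.

```python
from dataclasses import dataclass


@dataclass
class MapJob:
    name: str
    pending: dict   # task id -> set of node names that hold the task's input block
    skips: int = 0  # times this job was passed over while waiting for locality


def assign_map_task(jobs, node, max_skips=3):
    """Choose a map task for a free slot on `node`.

    `jobs` is assumed to be already ordered by the job scheduler's priority.
    Returns (job name, task id, is_local), or None if nothing is launched.
    """
    for job in jobs:
        if not job.pending:
            continue
        local = [t for t, nodes in job.pending.items() if node in nodes]
        if local:
            # Data-local task available: launch it and reset the skip counter.
            task = local[0]
            del job.pending[task]
            job.skips = 0
            return job.name, task, True
        if job.skips >= max_skips:
            # The job has waited long enough: accept a non-local task.
            task = next(iter(job.pending))
            del job.pending[task]
            return job.name, task, False
        # Otherwise skip this job for now and try the next one.
        job.skips += 1
    return None


if __name__ == "__main__":
    jobs = [MapJob("a", {"t1": {"n2"}}), MapJob("b", {"t2": {"n1"}})]
    print(assign_map_task(jobs, "n1", max_skips=1))
    print(assign_map_task(jobs, "n1", max_skips=1))
```

In the usage example, job `a` has no data on node `n1`, so the first slot goes to the local task of job `b`; on the next offer, `a` has exhausted its skip budget and runs its task non-locally. The skip bound trades a short wait for better locality.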
Keywords/Search Tags: Cloud Computing, Multi-level Feedback Queue, Job Schedule, MapReduce, Delay Schedule, Hadoop