The Improvement Of Recovery Mechanisms And Task Scheduling Based On MapReduce

Posted on:2014-09-28

Degree:Master

Type:Thesis

Country:China

Candidate:X Zhang

Full Text:PDF

GTID:2268330398990266

Subject:Computer technology

Abstract/Summary:

With the development of computer technology, cloud computing, as a new computing model, has had a huge influence within just a few years. Hadoop is a distributed computing platform that uses a master/slave architecture to support developing applications and processing very large data sets in parallel. In a cloud computing cluster, the master nodes sometimes encounter anomalies and interruptions, so how to recover the master nodes has drawn attention from academia and the press. In addition, under the MapReduce working mechanism, the system divides a task into several subtasks and assigns those subtasks to different physical nodes for processing. How to assign the subtasks is therefore also a concern for IT researchers. Based on a study of Hadoop, the MapReduce framework, and related theory of intelligent computing, this thesis focuses on the two MapReduce problems mentioned above. The concrete contents are as follows.

This thesis discusses and analyzes in detail the causes of data jams in the existing mechanisms, namely recovery from history logging, synchronization, and dropping. It then establishes a more effective mechanism that combines the characteristics of these three approaches, taking into account both the utilization rate of memory space and the efficiency of the parallel platform. The system first recovers from the history log. After that, it uses heartbeats that carry no information to build a list of working nodes. Then the recovered system discards any working node that has not sent it a heartbeat. The experimental results show that the new mechanism improves performance and efficiency and reduces both the recovery time and the number of anomalies.

The thesis also proposes an improved genetic algorithm with two fitness measures to address the problem of scheduling tasks efficiently. The algorithm considers not only the total task time but also the variance of the task time. The experimental results show that it can effectively reduce the variance of the task time without remarkably increasing the total task time. It thereby resolves the problem that some clients wait too long, and it improves both the efficiency of the parallel computing platform and the overall satisfaction of users.
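The three recovery steps described in the abstract (replay the history log, collect empty heartbeats, drop silent nodes) could be sketched roughly as follows. This is only an illustration under stated assumptions: the class `RecoveredMaster`, its method names, and the `(job_id, status, worker)` log format are hypothetical and are not taken from the thesis or from Hadoop's API.

```python
from dataclasses import dataclass, field


@dataclass
class RecoveredMaster:
    """Hypothetical master state rebuilt after a restart (not Hadoop's API)."""
    jobs: dict = field(default_factory=dict)          # job id -> last logged status
    known_workers: set = field(default_factory=set)   # workers named in the log
    live_workers: set = field(default_factory=set)    # workers heard from since restart

    def replay_history(self, log_entries):
        """Step 1: rebuild job state from the history log.

        Each entry is assumed to be a (job_id, status, worker) tuple.
        """
        for job_id, status, worker in log_entries:
            self.jobs[job_id] = status
            self.known_workers.add(worker)

    def on_heartbeat(self, worker):
        """Step 2: an empty heartbeat only proves that the worker is alive."""
        self.live_workers.add(worker)

    def drop_silent_workers(self):
        """Step 3: discard workers that sent no heartbeat; return the dropped set."""
        dropped = self.known_workers - self.live_workers
        self.known_workers &= self.live_workers
        return dropped
```

Because the heartbeats carry no payload in this sketch, the job state comes only from the log, and the heartbeat is used purely as a liveness signal. How long the master waits before calling `drop_silent_workers` is left open here, since the abstract does not state it.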
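To make the two fitness measures concrete, the sketch below scores one candidate schedule on both criteria. It rests on several assumptions that the abstract does not confirm: a chromosome is a list mapping each subtask to a node index, subtasks on the same node run back to back in index order, "total task time" is taken as the time at which the last node finishes, and "task time" is each subtask's finish time. The function name `evaluate` is hypothetical, and selection, crossover, and mutation are omitted.

```python
from statistics import pvariance


def evaluate(assignment, durations, num_nodes):
    """Return (total_time, variance) for one candidate schedule.

    assignment[i] is the node index chosen for subtask i (the chromosome).
    durations[i] is the assumed running time of subtask i.
    """
    node_clock = [0.0] * num_nodes   # time at which each node becomes free
    finish_times = []                # finish time of every subtask
    for task, node in enumerate(assignment):
        node_clock[node] += durations[task]   # tasks on a node run back to back
        finish_times.append(node_clock[node])
    total_time = max(node_clock)              # first measure: when all work is done
    variance = pvariance(finish_times)        # second measure: spread of finish times
    return total_time, variance


# Compare two candidate schedules for four subtasks on two nodes.
durations = [4, 2, 2, 4]
for candidate in ([0, 0, 1, 1], [0, 1, 1, 0]):
    print(candidate, evaluate(candidate, durations, num_nodes=2))
```

A genetic algorithm built on this would prefer schedules that keep the first value low while also lowering the second, which matches the stated goal of reducing long waits for some clients. How the thesis actually combines the two measures is not given in the abstract, so no combination rule is shown.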

Keywords/Search Tags:

Cloud Computing, MapReduce, Mechanism of Recovery, Task Scheduling

Related items

1	Design And Implementation Of The Failure Recovery Mechanism In MapReduce
2	The Research Of Task Scheduling Algorithm For Mapreduce Framework In Cloud Environment
3	The Research On High Performance Task Scheduling Technology Based On Mapreduce In Cloud Computing
4	Optimization And Research On Task Scheduling Algorithm In Cloud Computing
5	Research And Improvement Of MapReduce Scheduling Mechanism On Cloud Computing
6	Research On Cloud Task Scheduling Algorithms Based On Mapreduce
7	Research Of Task Scheduling And Results Recovery Strategy In Cloud Service
8	Research On Static Task Scheduling Mechanism In Cloud Computing Environment
9	Research Of Task Scheduling System In Electric Power Cloud Computing
10	Cloud Computing Task Scheduling And Scheduling Optimization Decision Problems In The Research