| With the development of computer technology, cloud computing, as a new computing model, has become highly influential within only a few years. Hadoop is a distributed computing platform that uses a master/slave architecture to support the development and parallel processing of very large data sets. In a cloud computing cluster, master nodes sometimes fail or are interrupted, so how to recover the master nodes has attracted attention from both academia and the press. In addition, under the MapReduce working mechanism, the system divides a task into several subtasks and assigns them to different physical nodes for processing, so how to assign the subtasks is also a concern for IT researchers. Based on a study of Hadoop, the MapReduce framework, and related theory of intelligent computing, this paper focuses on these two problems of MapReduce. The specific contents are as follows: This paper discusses and analyzes in detail the causes of data congestion in existing mechanisms such as recovery from history logging, synchronization, and dropping. Taking both memory utilization and the efficiency of the parallel platform into account, it establishes a more effective mechanism that combines the characteristics of recovery from history logging, synchronization, and dropping. The system first recovers from the history log. It then uses heartbeats that carry no information to compile a list of working nodes. Finally, the recovered system discards any working node that has not sent it a heartbeat. The experimental results show that the new mechanism improves performance and efficiency and relatively reduces both the Recovery Time and the number of anomalies. The paper also proposes an improved genetic algorithm with two fitness functions to address the problem of scheduling tasks efficiently. This algorithm considers not only the Total Tasks Time but also the variance of Task Time. 
The experimental results show that it can effectively reduce the variance of Task Time without significantly increasing the Total Tasks Time. It thereby resolves the problem of excessively long waiting times for some clients and improves both the efficiency of the parallel computing platform and overall user satisfaction. |
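
To make the two-fitness idea concrete, the following is a minimal sketch of how a candidate schedule might be scored on both criteria. It is not the algorithm from the paper: it assumes that Total Tasks Time means the makespan (the time at which the last task finishes), that Task Time means each task's completion time when a node runs its assigned tasks one after another, and that a schedule is encoded as a list mapping each task index to a node index. The names `completion_times`, `two_fitness`, `prefer`, and the `slack` parameter are hypothetical.

```python
from statistics import pvariance


def completion_times(assignment, durations, n_nodes):
    """Finish time of each task, assuming every node runs its tasks sequentially."""
    clock = [0.0] * n_nodes  # running clock of each node
    finish = []
    for task, node in enumerate(assignment):
        clock[node] += durations[task]
        finish.append(clock[node])
    return finish


def two_fitness(assignment, durations, n_nodes):
    """Return (total time, variance of task completion times); lower is better for both."""
    finish = completion_times(assignment, durations, n_nodes)
    total_time = max(finish)      # assumed meaning of Total Tasks Time (makespan)
    variance = pvariance(finish)  # assumed meaning of the variance of Task Time
    return total_time, variance


def prefer(a, b, durations, n_nodes, slack=0.05):
    """Hypothetical selection rule: if the total times are within `slack` of each
    other, pick the schedule with the lower variance; otherwise pick the faster one."""
    ta, va = two_fitness(a, durations, n_nodes)
    tb, vb = two_fitness(b, durations, n_nodes)
    if abs(ta - tb) <= slack * min(ta, tb):
        return a if va <= vb else b
    return a if ta < tb else b


durations = [4, 2, 3, 1]   # illustrative task lengths, not data from the paper
balanced = [0, 1, 1, 0]    # task i runs on node balanced[i]
serial = [0, 0, 0, 0]      # every task queued on node 0
best = prefer(balanced, serial, durations, n_nodes=2)
```

In a genetic algorithm, a rule like `prefer` could act as the comparison used during selection, so that candidates with similar total time compete on how evenly clients wait. The tolerance that counts as "not increased remarkably" is a tuning choice that this sketch only guesses at.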