Optimization of the Scheduling Algorithm and Download Mechanism of the Hadoop Platform

Posted on: 2013-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: Y Zhou
Full Text: PDF
GTID: 2248330374489198
Subject: Information and Communication Engineering
Abstract/Summary:
With the rapid development of Internet technology, data is growing explosively. As the carrier of information, data plays an increasingly important role as information technology develops. The difficulty of managing massive data, the high cost of data storage, and low reliability and security remain hard problems. More and more enterprises are moving into cloud computing and using it for distributed data computation and management. Cloud computing services offer high reliability, high scalability, and massive storage, so research on cloud computing service systems is a current trend in IT. To increase Hadoop's data processing speed, this thesis investigates the internal operating mechanisms of HDFS and MapReduce.

Because Hadoop runs in heterogeneous environments, an improved adaptive load-regulation scheduling algorithm is proposed so that Hadoop can assign tasks reasonably according to the computing capacity of each node. The adaptive scheduling algorithm is realized by combining Hadoop's scheduling algorithm with system load information. The original speculative execution algorithm is also improved so that it more accurately finds the stragglers that affect response time; this greatly improves the rate at which straggler tasks are found and lets Hadoop perform better in heterogeneous environments.

To address the low download efficiency and imbalanced load of HDFS, a distributed parallel file download algorithm is proposed. Drawing on the idea of multi-threaded peer-to-peer (P2P) downloading, an efficiency optimization algorithm is proposed at both the data-block level and the file level, which effectively improves the system's download efficiency. Building on the traditional parallel algorithm, a new speed prediction method uses the average download speed and the current speed to predict the future download speed more accurately. Experimental results show that, compared with Hadoop's download mechanism, our algorithm significantly improves system performance and meets users' download needs more quickly.
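
The abstract does not give the prediction formula, so the following is only a minimal sketch of how a predicted speed might be formed from the average and current download speeds and then used to divide blocks among source nodes. The weighted blend, the `weight` parameter, the function names `predict_speed` and `assign_blocks`, and the proportional block split are all assumptions made for illustration, not the thesis's actual algorithm.

```python
def predict_speed(average_speed, current_speed, weight=0.5):
    """Blend the long-run average and the most recent speed (e.g. bytes/s).

    ASSUMPTION: a simple weighted sum; the thesis may use a different formula.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be in [0, 1]")
    return weight * current_speed + (1.0 - weight) * average_speed


def assign_blocks(num_blocks, predicted_speeds):
    """Split num_blocks across source nodes in proportion to predicted speed.

    ASSUMPTION: proportional allocation is an illustrative policy only.
    """
    total = sum(predicted_speeds.values())
    if total <= 0:
        raise ValueError("at least one source needs a positive predicted speed")
    # Start with the floor of each node's proportional share.
    shares = {
        node: int(num_blocks * speed / total)
        for node, speed in predicted_speeds.items()
    }
    # Give the blocks lost to rounding to the fastest nodes first.
    leftover = num_blocks - sum(shares.values())
    fastest_first = sorted(predicted_speeds, key=predicted_speeds.get, reverse=True)
    for node in fastest_first[:leftover]:
        shares[node] += 1
    return shares


if __name__ == "__main__":
    # Hypothetical node names and speeds, for illustration only.
    speeds = {
        "node-a": predict_speed(average_speed=200.0, current_speed=400.0),
        "node-b": predict_speed(average_speed=100.0, current_speed=100.0),
    }
    print(assign_blocks(8, speeds))
```

Under these assumptions, a node whose recent speed has risen above its average receives a larger share of the remaining blocks, which is one way a parallel downloader could balance load across HDFS replicas.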
Keywords/Search Tags: MapReduce, HDFS, Speculative Execution, Parallel Download