Font Size: a A A

Optimization And Research Of Speculative Execution Strategy In Hadoop Fault Tolerance Mechanism

Posted on:2019-03-15Degree:MasterType:Thesis
Country:ChinaCandidate:D D JinFull Text:PDF
GTID:2370330545970258Subject:Computer Science and Technology
Abstract/Summary:PDF Full Text Request
With the continuous development of network information technology and explosive growth of Internet data,traditional data platforms can no longer satisfy the storage and pro-cessing of existing massive data,making distributed computing face new opportunities.Ha-doop is an open source distributed parallel computing platform based on the distributed stor-age platform HDFS and the computational framework MapReduce,which makes the storage and computation of massive data becoming possible.However,due to the inevitable applica-tion of hardware failure during the actual application process,how to ensure the robustness of the platform's storage and calculation has become a research hotspot in academic circles at home and abroad.This paper mainly studies the guarantee of computation robustness,that is the optimization of computational fault-tolerance mechanismsSpeculative execution is an important means for ensuring the accuracy of computational fault tolerance mechanism.It achieves the goals of reducing task execution time and improv-ing the cluster throughput through discoverig the unusually slow tasks,which is called"Straggler" and speculative copies of these tasks on the back_up nodes to be executed.This paper focuses on the optimization of speculative execution mechanisms.The purpose is to improve the accuracy of the judgment of the "Straggler "and back it up to the node with better computing performance to achieve the goal of saving cluster resources.The main tasks in-clude:(1)In order to improve the accuracy of the“Straggler”determination in the speculative execution,an optimization strategy called "LWR-SE" was proposed,which predicts the task remaining execution time based on locally weighted regression.Through the real-time acqui-sition of the progress and execution time information in the task running process,it was found that there is a local linear relationship between task progress and execution time.The local weighted linear regression algorithm is introduced to predict the remaining time of the task in real time.At the same time,the execution time of the backup task is estimated in stages,and the overall benefit of the speculative execution strategy is ensured in combination with the cost-benefit model.The experimental results show that LWR-SE performs better in predicting the remaining time of the real-time tasks and improves the job execution time and cluster throughput compared with the classical speculative execution algorithm.(2)Since LWR-SE does not consider the task scheduling of backup task,which will cause-the problem of low utilization rate of nodes,a hybrid resource scheduling strategy in speculative execution based on non-cooperative game theory(HRSE)is proposed.It trans-formes the resource scheduling in speculative execution into a multi-party non-cooperative game model.The input of the model is the backup task and the original tasks in the cluster,then the possible execution node can be obtained by the benefit calculation.Finally,the bal-anced scheduling scheme is obtained by finding the Nash equilibrium solution from the pos-sible execution node.The experimental results show that the HRSE strategy can not only ef-fectively reduce the overall task execution time,but also improves the node utilization com-pared with the classical speculative execution algorithm.
Keywords/Search Tags:Hadoop, Computational Fault Tolerance, Speculative Execution, Locally Weighted Linear Prediction, Resource Scheduling
PDF Full Text Request
Related items