Research And Implementation Of An MAS-based And Fault-Tolerant Distributed ETL System

Posted on: 2014-07-25
Degree: Master
Type: Thesis
Country: China
Candidate: J L Huang
GTID: 2308330461973936
Subject: Computer application technology
Abstract/Summary:
As enterprises computerize their operations, they accumulate massive volumes of transaction data. These data are heterogeneous and inconsistent, so the enterprise cannot use them directly and risks losing its competitive advantage. ETL tools transform daily transaction data into decision data stored in the data warehouse, providing a reliable foundation for the enterprise's management decisions. Because ETL tools must handle massive volumes of transaction data, the efficiency and stability of their execution are increasingly important. Integrating massive data also demands higher reliability from the ETL system. To improve the stability and reliability of data integration in the ETL system and to overcome the drawbacks of a single cooperative control server, in this paper we introduce double-backup technology into the ETL system and design a solution based on double backup of the cooperative control servers. When a software or hardware error leaves the main cooperative control server unable to provide application services to users, the backup control server takes over from it and continues to serve users without interrupting data integration.
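The takeover behaviour described above can be illustrated with a minimal sketch. It assumes heartbeat-based failure detection, which the abstract does not specify; the names `ControlServer`, `FailoverMonitor`, and the `timeout` parameter are hypothetical and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class ControlServer:
    name: str
    last_heartbeat: float  # time of the most recent heartbeat, in seconds


class FailoverMonitor:
    """Hypothetical monitor: promotes the backup when the main server goes silent."""

    def __init__(self, primary, backup, timeout=3.0):
        self.primary = primary
        self.backup = backup
        self.timeout = timeout  # assumed heartbeat window, in seconds
        self.active = primary

    def check(self, now):
        # Switch to the backup once the main server has missed its heartbeat window.
        # There is no automatic failback in this sketch.
        silent_for = now - self.primary.last_heartbeat
        if self.active is self.primary and silent_for > self.timeout:
            self.active = self.backup
        return self.active


primary = ControlServer("primary", last_heartbeat=0.0)
backup = ControlServer("backup", last_heartbeat=0.0)
monitor = FailoverMonitor(primary, backup, timeout=3.0)
```

In this sketch, clients would always send requests to `monitor.check(now)`, so the switch to the backup is invisible to them.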
Meanwhile, so that the ETL system can focus on transaction logic, we extract the logging functions into a common log module that meets the distributed system's logging requirements. We also propose an algorithm named ETL_Batch for recovering from errors in ETL jobs. In addition to implementing double backup of the cooperative control servers, we have the cooperative servers send the ETL jobs that a computing server did not finish to other available computing servers without redoing the whole jobs, which saves time, and recovering these jobs with ETL_Batch improves the efficiency of recovery. The experimental results show that the ETL_Batch recovery algorithm performs better when the ETL system suffers an error while executing ETL jobs. Based on an in-depth study of multi-agent technology, double backup of cooperative control servers, distributed log management, and fault-tolerant processing of ETL jobs, we design and implement an ETL fault-tolerance system, FTEL, that provides a highly reliable ETL system for small and medium-sized enterprises.
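The idea of handing an unfinished job to another computing server without redoing the whole job can be sketched as below. This is not the ETL_Batch algorithm itself, whose details the abstract does not give; it is a simple batch-level checkpoint illustration, and `run_job`, `make_worker`, `WorkerFailure`, and the `checkpoint` dictionary are all hypothetical names.

```python
class WorkerFailure(Exception):
    """Raised when a (simulated) computing server goes down."""


def run_job(rows, batch_size, worker, checkpoint):
    # Process rows in batches, starting from the last recorded position,
    # and record progress after every successfully loaded batch.
    start = checkpoint.get("next", 0)
    for i in range(start, len(rows), batch_size):
        worker(rows[i:i + batch_size])
        checkpoint["next"] = min(i + batch_size, len(rows))
    return checkpoint


def make_worker(sink, fail_after=None):
    # Simulated computing server that loads batches into `sink`
    # and optionally fails after `fail_after` successful batches.
    calls = {"n": 0}

    def worker(batch):
        if fail_after is not None and calls["n"] >= fail_after:
            raise WorkerFailure("computing server went down")
        calls["n"] += 1
        sink.extend(batch)

    return worker


rows = list(range(10))
loaded = []
checkpoint = {}

try:
    # The first computing server fails after two batches.
    run_job(rows, 3, make_worker(loaded, fail_after=2), checkpoint)
except WorkerFailure:
    pass

resumed_from = checkpoint["next"]
# A second computing server resumes from the checkpoint, not from row 0.
run_job(rows, 3, make_worker(loaded), checkpoint)
```

Only the batches after the checkpoint are processed again, so the cost of recovery depends on the chosen batch size instead of the size of the whole job.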
Keywords/Search Tags: ETL, cooperative control server double-backup, fault-tolerant, distributed log, ETL_Batch