The Optimization And Realization Of Big Data Connection Processing Technology In The E-government Environment

Posted on: 2019-07-09
Degree: Master
Type: Thesis
Country: China
Candidate: L L Yu
GTID: 2436330563957691
Subject: Software engineering
Abstract/Summary:
With the rapid development of Internet technology, e-government has been widely adopted in administrative institutions. Government leaders pay close attention to how useful information can be analyzed and extracted from the vast data collected by the government, so that decision-makers can make informed, scientific decisions. It is therefore meaningful to study how to optimize multi-dataset join query algorithms in order to improve the efficiency of data analysis and processing. Based on an understanding and analysis of the theories and technologies related to big data, and combined with the background and specific requirements of an actual project, this work completes the design and implementation of a data visualization analysis system for the e-government environment, and optimizes the query efficiency of the two-table and multi-table equivalent connections (equi-joins) involved in the system.

The thesis first makes an in-depth comparative analysis of current big data processing technologies in terms of their principles and use cases, and selects Hadoop as the core technology of the data visualization analysis system based on the actual project requirements. Because departmental data on the e-government platform is decentralized, large in volume, and not standardized, the current mainstream data integration tools are analyzed and compared, and Sqoop is chosen for the data integration and loading part of the system. Sqoop loads the data into Hive for processing, and the processed data is saved to the HBase database. HighCharts is used as part of the data presentation tool because of its stable performance and good browser compatibility. Finally, the parts are integrated through configuration files, and tasks are run by the system's scheduled tasks.

To address the low performance of two-table and multi-table equi-joins encountered in visual analysis and processing, the thesis proposes an optimization scheme for two-table and multi-table equi-join algorithms in a big data environment. First, it studies how to use an improved Bloom filter to filter out irrelevant data efficiently during the map phase and so reduce network traffic. Then, based on the data that remains after filtering, it studies the equi-join algorithm for two tables and for multiple tables in MapReduce. Finally, the efficiency of the proposed algorithm is verified.
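The idea of filtering during the map phase before joining can be illustrated with a minimal single-process sketch. It uses a standard Bloom filter, not the improved variant proposed in the thesis, and it only simulates the map, shuffle, and reduce steps in memory. The names `BloomFilter`, `bloom_equi_join`, `num_bits`, and `num_hashes` are hypothetical and chosen for illustration only.

```python
import hashlib
from collections import defaultdict


class BloomFilter:
    """Standard Bloom filter (assumption: not the thesis's improved variant)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, key):
        # Derive num_hashes bit positions by salting a single SHA-256 hash.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos] = 1

    def might_contain(self, key):
        # False means "definitely absent"; True may be a false positive.
        return all(self.bits[pos] for pos in self._positions(key))


def bloom_equi_join(small, large, num_bits=1024, num_hashes=3):
    """Equi-join two lists of (key, value) records.

    Returns (joined_rows, shuffled_count), where shuffled_count is the
    number of records from `large` that survive the map-side filter.
    """
    # Build the filter from the join keys of the smaller table.
    bloom = BloomFilter(num_bits, num_hashes)
    for key, _ in small:
        bloom.add(key)

    # Simulated map phase: drop records whose key cannot match.
    shuffled = [(k, v) for k, v in large if bloom.might_contain(k)]

    # Simulated shuffle: group both sides by join key.
    groups = defaultdict(lambda: ([], []))
    for k, v in small:
        groups[k][0].append(v)
    for k, v in shuffled:
        groups[k][1].append(v)

    # Simulated reduce phase: emit the per-key cross product.
    joined = [
        (k, left_value, right_value)
        for k, (left, right) in groups.items()
        for left_value in left
        for right_value in right
    ]
    return joined, len(shuffled)
```

Because a Bloom filter never produces false negatives, the join result is unaffected by the filter; a false positive only means that a non-matching record is shuffled and then discarded in the reduce step. In an actual MapReduce job the filter would typically be built once and distributed to the mappers, which this sketch does not model.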
Keywords/Search Tags:e-government, join operation, big data analysis, data integration, filter irrelevant data