
Government Websites Distributed Log Storage And Analysis System

Posted on: 2015-01-28
Degree: Master
Type: Thesis
Country: China
Candidate: B Han
GTID: 2268330431969399
Subject: Computer application technology
Abstract/Summary:
With the development of information and communication technologies, e-government is increasingly seen as a framework that goes beyond conventional public services. Countries around the world are studying how to improve the quality of government websites and online services in order to improve government efficiency and increase the government's credibility. Combining the current state of China's government websites with Hadoop-based technology, this thesis proposes a method that identifies users' points of interest and behavior patterns by collecting and analyzing website access logs. The method provides data to support the construction and maintenance of government websites and to improve the quality of the services they offer. The thesis covers the following aspects.
(1) The thesis describes the current state of e-government development in China and the problems that need to be solved, and demonstrates the importance of analyzing the logs of government websites. After summarizing current research on log analysis in China and abroad, it proposes the objectives and framework of the work.
(2) The thesis introduces distributed technologies in current use from two aspects, distributed storage (including GFS and HDFS) and distributed processing, and examines the MapReduce parallel programming model.
(3) The thesis analyzes the system requirements in detail and divides the system into log collection, log storage, log analysis, and other functional modules. It also designs the overall structure of the system and proposes a hierarchical structure suited to collecting logs from many government websites, defining the function of each layer.
(4) The thesis details the log collection method in two parts, local collection and distributed collection. Local log collection combines Web Beacon with the commonly used JavaScript tag to achieve cross-domain log storage. Distributed log collection is designed using Apache Flume, an open-source distributed log collection system.
(5) The thesis discusses, in turn, the design of the distributed storage model and the distributed processing model for logs. By analyzing the storage requirements of government website logs against the characteristics of HBase, it selects the HBase distributed database as the storage platform for these logs and designs the table structure of the log database. Using the MapReduce model and the HBase interface, it implements a method for processing the distributed logs.
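As a rough illustration of the MapReduce-style log analysis mentioned in item (5), the sketch below counts successful page views per URL path in plain Python. It is not the thesis's implementation: the Common Log Format layout, the sample log lines, and the function names (map_phase, shuffle, reduce_phase, count_page_views) are all assumptions made for this example, and a real deployment would run the map and reduce steps on Hadoop against HBase rather than in a single process.

```python
import re
from collections import defaultdict

# Assumed layout: Common Log Format. The abstract does not specify a log format.
LOG_PATTERN = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<status>\d{3}) \S+'
)


def map_phase(line):
    """Emit (path, 1) for each request served with status 200; skip other lines."""
    match = LOG_PATTERN.match(line)
    if match and match.group("status") == "200":
        yield match.group("path"), 1


def shuffle(pairs):
    """Group values by key, as a MapReduce framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Sum the per-request counts for one path."""
    return key, sum(values)


def count_page_views(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of log lines."""
    pairs = (pair for line in lines for pair in map_phase(line))
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical sample data, invented for this sketch.
SAMPLE_LOGS = [
    '10.0.0.1 - - [28/Jan/2015:10:00:00 +0800] "GET /services/tax HTTP/1.1" 200 512',
    '10.0.0.2 - - [28/Jan/2015:10:00:05 +0800] "GET /services/tax HTTP/1.1" 200 512',
    '10.0.0.1 - - [28/Jan/2015:10:01:00 +0800] "GET /news HTTP/1.1" 200 2048',
    '10.0.0.3 - - [28/Jan/2015:10:02:00 +0800] "GET /missing HTTP/1.1" 404 128',
    'malformed line',
]

if __name__ == "__main__":
    for path, views in sorted(count_page_views(SAMPLE_LOGS).items()):
        print(path, views)
```

Paths with the highest counts would be a first approximation of the users' points of interest; the same map and reduce functions could in principle be keyed on visitor address or time window to study behavior patterns.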
Keywords/Search Tags: E-government, MapReduce, Distributed Collection, Distributed Storage