
Design And Implementation Of Log Big Data Service Platform Based On ElasticSearch And Storm

Posted on: 2019-02-28
Degree: Master
Type: Thesis
Country: China
Candidate: L Luo
Full Text: PDF
GTID: 2348330542493906
Subject: Circuits and Systems
Abstract/Summary:
Log data plays an indispensable role in communication networks, the Web, and application development. It contains rich information and is an important reference for staff who need to locate and diagnose system problems. Traditional log processing methods rely on a single server, which severely limits scalability and storage capacity. With the development of big data technologies, distributed systems have kept growing in scale and services have become increasingly complex. As a result, the volume of log data generated per unit time has grown exponentially. At the same time, the multi-source heterogeneity and decentralized storage of log data pose challenges for unified log collection, storage, real-time processing, and in-depth analysis. In practice, traditional methods cannot meet the needs of real-time processing and in-depth mining of logs, while enterprises increasingly demand fast log retrieval and short analysis response times. To address these problems in the context of big data, this thesis studies the basic theory and distributed architecture of big data technology and develops a log big data service platform based on ElasticSearch and Storm. The main research contents are as follows:

(1) Based on the research background and the enterprise's actual situation, the functional requirements and overall architecture of the platform are defined. Flume, Kafka, ElasticSearch, Storm, Zookeeper, and other big data technologies are integrated to deploy the overall architecture as a cluster.
(2) Flume is used to collect logs from different sources in real time at the Flume client and pass them to the Flume server. The Channel component in the Agent is also improved. Platform testing shows that the improved Channel component is compatible with both MemoryChannel and FileChannel and supports real-time acquisition of large-scale logs.
(3) Kafka message queues are used to cache and distribute log data. Replica allocation policies, allocation algorithms, and deletion policies are configured in the Kafka cluster to achieve reliable log transmission and real-time consumption.
(4) Kafka is integrated with ElasticSearch and Storm to provide stable access to the log data streams. ElasticSearch serves as the storage center and search engine for log data and supports aggregation and statistical analysis of logs. Storm performs real-time processing of the log data stream and implements rule-based log matching alarms.
(5) A front-end visual management platform for log data is built, and Thrift is used for data interaction between application services and basic services, enabling users to retrieve and analyze logs.

On this basis, the log big data service platform based on ElasticSearch and Storm was built and put into practical use at a company. Testing and practical use show that the platform performs stably, effectively supports fast acquisition, reliable transmission, and mass storage of multi-source heterogeneous logs, and integrates log management, retrieval, alarm, and analysis functions. The results meet the design expectations.
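The rule-based log matching alarm mentioned in item (4) can be illustrated with a minimal Python sketch. The abstract does not describe the thesis's actual rule format or how the logic is embedded in Storm, so everything here is an assumption for illustration: the `AlarmRule` class, the `match_alarms` function, the regular-expression rules, and the count thresholds are all hypothetical. In the platform, logic of this kind would presumably run inside a Storm processing step over the live stream rather than over an in-memory list.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class AlarmRule:
    """Hypothetical rule: alarm when `pattern` matches at least `threshold` lines."""
    name: str
    pattern: str    # regular expression searched for in each log line
    threshold: int  # number of matching lines needed to raise the alarm


def match_alarms(lines, rules):
    """Return the names of the rules whose match count reaches their threshold."""
    counts = {rule.name: 0 for rule in rules}
    compiled = [(rule, re.compile(rule.pattern)) for rule in rules]
    for line in lines:
        for rule, regex in compiled:
            if regex.search(line):
                counts[rule.name] += 1
    return [rule.name for rule in rules if counts[rule.name] >= rule.threshold]


# Illustrative rules and log lines (invented for this sketch).
sample_rules = [
    AlarmRule("error_burst", r"\bERROR\b", 2),
    AlarmRule("oom", r"OutOfMemoryError", 1),
]
sample_lines = [
    "2019-02-28 10:00:01 INFO service started",
    "2019-02-28 10:00:02 ERROR connection refused",
    "2019-02-28 10:00:03 ERROR connection refused",
]
triggered = match_alarms(sample_lines, sample_rules)
print(triggered)
```

With these sample lines, only the `error_burst` rule reaches its threshold. A streaming version would keep the counters per time window instead of per batch.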
Keywords/Search Tags: Big data, Distributed, Real-time processing, Storm, Log retrieval