| In recent years, as society has become increasingly information-driven, Internet services have penetrated all aspects of people’s lives, and Internet service providers face rising user expectations for service quality. While a service is running, problems can arise at any time; they degrade the user experience and confuse users. Companies therefore need to collect the corresponding service logs so that problems can be located and repaired promptly. However, the logs of large-scale services are usually massive, so collecting and analyzing them effectively and giving timely feedback to R&D and maintenance personnel is a major challenge for the industry. To address this, the author proposes a service log analysis solution based on the Elastic Stack that can effectively collect, analyze, and display service logs. The main research content includes the collection and preprocessing of raw service log data, real-time statistical analysis of log data, and storage and visualization of log data. Building on the Elastic Stack and combining it with the Storm real-time computing framework, the Spark MLlib machine learning library, and the Kafka message queue, the author designs and implements a reliable and efficient system for distributed collection, real-time statistical analysis, storage, and display of massive service logs. In terms of workflow, the system first deploys the Filebeat component on each service node of the cluster for distributed log collection, then preprocesses the log data with Logstash and sends it to the Kafka message queue for buffering. Storm consumes the data from Kafka to perform real-time statistics and cluster analysis, the results are persisted in Elasticsearch and Redis, and Kibana visualizes the log analysis results. Within this pipeline, the data analysis module
is based on the Spark MLlib machine learning library and performs cluster analysis of the error logs. The process consists of preprocessing the text content of the error logs, extracting features with the TF-IDF algorithm, and training and tuning a clustering model with the parallel K-means algorithm. The model is persisted as a PMML file, which Storm then loads to perform real-time cluster prediction. Verification on service logs shows that the service log analysis system based on the Elastic Stack can effectively collect, process, and analyze service logs and visualize the analysis results, which facilitates the work of service R&D and maintenance personnel and can effectively improve service quality and work efficiency. |
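
The error-log clustering step described above (text preprocessing, TF-IDF feature extraction, K-means training, then prediction on new log lines) can be illustrated with a minimal sketch. It is not the author's implementation: it runs in a single process with the Python standard library instead of Spark MLlib, keeps the model in memory instead of exporting PMML for Storm, and uses a deterministic farthest-first initialisation so the result is reproducible. The sample log messages and all function names are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(message):
    """Preprocess a log message: lowercase it and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", message.lower())


def fit_tfidf(messages):
    """Learn a smoothed IDF weight, log((N + 1) / (df + 1)), for every token."""
    docs = [tokenize(m) for m in messages]
    doc_freq = Counter(tok for doc in docs for tok in set(doc))
    n_docs = len(docs)
    return {tok: math.log((n_docs + 1) / (df + 1))
            for tok, df in sorted(doc_freq.items())}


def vectorize(message, idf):
    """Turn one message into an L2-normalised TF-IDF vector (unknown tokens are ignored)."""
    counts = Counter(tokenize(message))
    vec = [counts[tok] * weight for tok, weight in idf.items()]
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(vec, centroids):
    """Index of the centroid closest to vec."""
    return min(range(len(centroids)),
               key=lambda c: squared_distance(vec, centroids[c]))


def kmeans(vectors, k, iterations=10):
    """Single-process Lloyd's k-means; returns (labels, centroids)."""
    # Assumption: deterministic farthest-first initialisation, chosen only to
    # make this example reproducible (Spark MLlib defaults to k-means||).
    centroids = [list(vectors[0])]
    while len(centroids) < k:
        centroids.append(list(max(
            vectors,
            key=lambda v: min(squared_distance(v, c) for c in centroids))))
    labels = [0] * len(vectors)
    for _ in range(iterations):
        labels = [nearest(v, centroids) for v in vectors]
        for c in range(k):
            members = [v for v, label in zip(vectors, labels) if label == c]
            if members:  # keep the previous centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


def predict(message, idf, centroids):
    """Assign a new log message to the nearest trained cluster."""
    return nearest(vectorize(message, idf), centroids)


# Hypothetical error-log messages, used only for illustration.
messages = [
    "Connection timeout while calling payment service",
    "Connection timeout while calling order service",
    "NullPointerException in UserController",
    "NullPointerException in OrderController",
]
idf = fit_tfidf(messages)
vectors = [vectorize(m, idf) for m in messages]
labels, centroids = kmeans(vectors, k=2)
print(labels)
print(predict("Connection timeout while calling inventory service", idf, centroids))
```

In the system described in the abstract, the training half of this sketch would correspond to the Spark MLlib job and the `predict` call to the step that Storm performs on each incoming error log after loading the persisted model.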