| With the rapid development of the Internet industry, distributed clusters have become the mainstream solution for enterprises facing large data volumes and high-concurrency scenarios. A cluster ensures that other machines can continue to serve external requests when one machine fails. However, managing huge clusters has become a problem in its own right. Clusters often face serious security risks from network fluctuations, hardware abnormalities, and human-caused damage. At present, enterprises manage and maintain clusters by monitoring hardware and collecting logs. However, this approach is relatively passive and the monitoring is not comprehensive, which limits the quality of operation and maintenance. An intelligent operation and maintenance system that can assist enterprises in various management tasks is therefore important to the business. This dissertation first improves the original log system, and then embeds an intelligent emergency response module and an intrusion detection module based on the improved log module, to realize multi-faceted auxiliary operation and maintenance of the cluster. The research and design of the operation and maintenance system in this paper are as follows: (1) Improve the existing log system. At present, the ELK log system commonly used by enterprises is not sufficient to support multi-faceted auxiliary operation and maintenance of the cluster, so this paper first improves the data analysis layer of the system. (2) To speed up emergency disposal, an intelligent emergency disposal module is designed on top of the improved log module. In the improved log module, log keywords are extracted with a word frequency algorithm, and a similarity algorithm uses the extracted keywords to find similar records in the historical alarm log database maintained by the enterprise. Following the idea of content-based recommendation, the historical disposal methods associated with the similar records are pushed to the staff, making emergency disposal intelligent. (3) To prevent internal employees from illegally operating the cluster or using its resources for mining, this paper also designs an intrusion detection module based on the improved log module. In the improved log module, the behavior characteristics of a user operating the cluster are extracted, the current behavior characteristics are compared with the historical behavior characteristics to calculate a risk value, and a risk accumulation strategy accumulates these values. When the accumulated risk of an account reaches the threshold, the intrusion detection module immediately circuit-breaks (disables) the account and sends an alarm email, so that illegal operations by employees can be monitored. |
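
The two modules described in (2) and (3) can be illustrated with a minimal sketch. It is not the dissertation's implementation: the abstract does not name the specific word frequency or similarity algorithms, so plain term frequency and Jaccard similarity are assumed here, and every name (`extract_keywords`, `recommend_disposal`, `AccountRisk`, the stopword list, the sample log lines and the threshold value) is hypothetical.

```python
import re
from collections import Counter
from dataclasses import dataclass

# Assumed stopword list; a real system would use a domain-specific one.
STOPWORDS = {"the", "a", "an", "on", "at", "to", "of", "in", "is", "for"}


def extract_keywords(log_text, top_n=5):
    """Return the top_n most frequent non-stopword tokens (plain term frequency)."""
    tokens = re.findall(r"[a-z0-9_]+", log_text.lower())
    counts = Counter(t for t in tokens if t not in STOPWORDS)
    return {word for word, _ in counts.most_common(top_n)}


def jaccard(a, b):
    """Jaccard similarity of two sets; two empty sets count as identical."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def recommend_disposal(alarm, history, top_n=5):
    """Return the disposal method of the most similar historical alarm.

    history is a list of (log_text, disposal_method) pairs.
    """
    if not history:
        return None
    keywords = extract_keywords(alarm, top_n)
    best = max(history,
               key=lambda rec: jaccard(keywords, extract_keywords(rec[0], top_n)))
    return best[1]


@dataclass
class AccountRisk:
    """Accumulates per-operation risk and flags the account at a threshold."""
    threshold: float = 1.0      # assumed value, not taken from the source
    accumulated: float = 0.0
    suspended: bool = False

    def observe(self, current, baseline):
        """Score current behavior features against the historical baseline."""
        risk = 1.0 - jaccard(current, baseline)   # dissimilarity as risk
        self.accumulated += risk
        if self.accumulated >= self.threshold:
            self.suspended = True                 # alarm email omitted here
        return risk


if __name__ == "__main__":
    history = [
        ("disk usage above 95 percent on node db01",
         "Clear old logs and expand the volume"),
        ("connection timeout to redis cluster node",
         "Restart the redis proxy and check the network"),
    ]
    print(recommend_disposal("disk usage above 90 percent on node db02", history))

    baseline = {"ssh_login", "kubectl_get", "read_logs"}
    account = AccountRisk(threshold=1.0)
    for features in ({"ssh_login", "kubectl_get", "read_logs"},
                     {"ssh_login", "install_miner", "modify_cron"},
                     {"ssh_login", "install_miner", "modify_cron"}):
        account.observe(features, baseline)
    print(account.suspended)
```

Using dissimilarity to the historical baseline as the per-operation risk keeps the sketch small. A production system would more likely weight individual features and decay the accumulated risk over time, neither of which the abstract specifies.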