
The Log Pattern Cluster Mining Algorithm Based On Prefix Tree

Posted on: 2015-02-10
Degree: Master
Type: Thesis
Country: China
Candidate: M Q Zhang
Full Text: PDF
GTID: 2268330425484668
Subject: Computer technology
Abstract/Summary:
The 21st century is the era of networks and information. Individuals and businesses alike depend on the Internet, so Internet security and privacy have become major concerns. Log data records the activity of network devices in real time, which makes it valuable evidence when investigating network attacks, intrusions, and other categories of events. Using log data, operations and maintenance technicians can monitor the health of systems and networks in a timely manner, and can also observe other conditions such as user usage. However, log data is usually voluminous and hard to interpret, so mining and extracting useful knowledge from it is necessary.

Because network devices are diverse, examining their logs one by one is time-consuming. This paper presents a systematic study of an architecture for distributed collection and centralized storage of log data. Log messages are classified into categories and stored on a centralized Syslog server, which facilitates centralized log management and statistical analysis. Mining the log data on the Syslog server yields the frequent and infrequent patterns of user behavior.

By analyzing the characteristics of log data and of association rule mining algorithms, this paper improves on previous log data mining algorithms and proposes an improved log-pattern clustering algorithm (the ILC algorithm). Combining a prefix tree with the ILC algorithm then yields a log-pattern clustering algorithm based on a prefix tree (the PTLC algorithm). Finally, the concept of byte offset is applied to the ILC and PTLC algorithms, producing the BILC and BPTLC algorithms. The experimental results show that these four algorithms use less time and space than traditional log pattern clustering algorithms, and that performance improves considerably.
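The abstract does not spell out how the ILC or PTLC algorithms work, so the following is only a generic illustration of the idea of clustering log lines into patterns with a prefix tree, not the method of the thesis. It assumes whitespace tokenization, treats a word as frequent when it appears at the same position in at least `min_support` lines, and replaces every other word with a `*` wildcard. All names (`frequent_words`, `build_prefix_tree`, `mine_patterns`, `min_support`) and the sample log lines are hypothetical, and the byte offset variants are not covered.

```python
from collections import Counter


def frequent_words(lines, min_support):
    """Return the (position, word) pairs seen in at least min_support lines."""
    counts = Counter()
    for line in lines:
        for pos, word in enumerate(line.split()):
            counts[(pos, word)] += 1
    return {key for key, n in counts.items() if n >= min_support}


class Node:
    """One prefix tree node; count is the number of lines ending here."""

    def __init__(self):
        self.children = {}
        self.count = 0


def build_prefix_tree(lines, frequent):
    """Insert each line into the tree, with infrequent words replaced by '*'."""
    root = Node()
    for line in lines:
        words = line.split()
        if not words:
            continue  # skip blank lines
        node = root
        for pos, word in enumerate(words):
            token = word if (pos, word) in frequent else "*"
            node = node.children.setdefault(token, Node())
        node.count += 1
    return root


def collect_patterns(root, min_support):
    """Walk the tree and return (pattern, count) for well-supported patterns."""
    found = []

    def walk(node, prefix):
        if node.count >= min_support:
            found.append((" ".join(prefix), node.count))
        for token, child in sorted(node.children.items()):
            walk(child, prefix + [token])

    walk(root, [])
    return found


def mine_patterns(lines, min_support=2):
    """Two passes over the data: find frequent words, then cluster the lines."""
    frequent = frequent_words(lines, min_support)
    root = build_prefix_tree(lines, frequent)
    return collect_patterns(root, min_support)


if __name__ == "__main__":
    sample = [
        "sshd login failed for alice",
        "sshd login failed for bob",
        "sshd login failed for carol",
        "kernel link up eth0",
        "kernel link up eth1",
        "cron job started",
    ]
    for pattern, count in mine_patterns(sample, min_support=2):
        print(count, pattern)
```

In this sketch, lines that share a template collapse onto one path of the tree, so common prefixes are stored once. Lines whose pattern falls below `min_support` (here the single `cron` line) are left out of the result and would correspond to infrequent patterns.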
Keywords/Search Tags: Data mining, Syslog server, Association rules, Clustering, Prefix tree, Byte offset