Research On Key Techniques Of Event Log Sampling For Large-scale And Complex Business Processes

Posted on: 2024-04-14  Degree: Master  Type: Thesis
Country: China  Candidate: S P Zhang  Full Text: PDF
GTID: 2558307136472774  Subject: Computer Science and Technology
Abstract/Summary:
Information systems record large volumes of business process event logs, and process discovery aims to derive process models from these logs. Traditional process discovery methods cannot process large-scale event logs efficiently. Event log sampling offers a feasible way to mine such logs efficiently, but existing sampling methods still have efficiency bottlenecks. To address the problems of mining large-scale, complex business processes, this thesis proposes a relatively complete system of event log sampling techniques. It aims to overcome the limitations of traditional process discovery algorithms on large-scale, complex event logs, to remedy the shortcomings of traditional event log sampling techniques, and to provide a set of efficient quality evaluation indicators for the sample logs that sampling produces. Starting from the event log data recorded in business information systems, the thesis studies four problems of large-scale, complex event logs:

1. Traditional process discovery algorithms are inefficient on large-scale event logs. A more efficient event log sampling technique, Log Rank++, is therefore proposed. The method first determines the significance features of a trace, such as its activities and directly-follows relations, then computes a significance value for each trace and ranks the traces by it, and finally selects the most significant traces to form the sample event log.

2. For discovery algorithms with a noise-handling mechanism, such as Heuristics Miner, the thesis proposes a behavior-invariant event log sampling technique. It has three stages: selecting trace variants and their frequencies by ratio, calculating the DF (directly-follows relation) weight of each trace, and sampling based on set coverage. Together these stages ensure that the process model mined from the sample event log behaves consistently with the model mined from the original event log.

3. Existing sampling techniques have low sampling accuracy on heterogeneous (structurally complex) event logs. A more accurate event log sampling technique based on trace clustering is proposed. The method first decomposes the event log into a group of homogeneous sub-logs by trace clustering, then samples each sub-log with existing sampling methods, and then merges the resulting sub-log samples into the final sample log. Finally, the quality of the sample logs is evaluated from the perspective of process model mining.

4. When used to evaluate the quality of sample logs, traditional conformance checking methods are inefficient and easily affected by the discovery algorithm used. A set of reasonable, efficient and accurate sample log quality evaluation indicators is proposed. The method first converts each event log into its corresponding behavior characteristic matrix, and then applies a log similarity method to compare the matrix of the original log with that of the sample log. The resulting matrix similarity value is used as the measure of sample log quality.
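
The ranking-based sampling of the first contribution and the matrix-based quality check of the fourth can be illustrated with a minimal Python sketch. The abstract does not give the actual Log Rank++ scoring formula or the exact similarity measure, so everything below is an assumption made for illustration. The function names (`directly_follows`, `sample_log`, `df_similarity`) are hypothetical, the significance score is a simple mean of feature frequencies, and cosine similarity over directly-follows counts stands in for the behavior characteristic matrix comparison.

```python
from collections import Counter
from math import sqrt

# An event log is modeled here as a list of traces; a trace is a tuple of activity names.

def directly_follows(trace):
    """Return the directly-follows pairs (a, b) of one trace, in order."""
    return list(zip(trace, trace[1:]))

def sample_log(log, ratio):
    """Keep the top `ratio` share of traces, ranked by an assumed significance score.

    Assumption: a trace's significance is the mean, over its distinct activities
    and directly-follows pairs, of the share of traces containing that feature.
    This is a stand-in, not the scoring defined in the thesis.
    """
    n = len(log)
    act_count = Counter(a for t in log for a in set(t))
    df_count = Counter(p for t in log for p in set(directly_follows(t)))

    def significance(trace):
        feats = [act_count[a] / n for a in set(trace)]
        feats += [df_count[p] / n for p in set(directly_follows(trace))]
        return sum(feats) / len(feats) if feats else 0.0

    k = max(1, round(ratio * n))
    ranked = sorted(log, key=significance, reverse=True)  # stable sort
    return ranked[:k]

def df_counts(log):
    """Sparse directly-follows count matrix of a log, stored as a Counter."""
    counts = Counter()
    for trace in log:
        counts.update(directly_follows(trace))
    return counts

def df_similarity(log_a, log_b):
    """Cosine similarity of the two logs' directly-follows count matrices."""
    a, b = df_counts(log_a), df_counts(log_b)
    dot = sum(a[p] * b[p] for p in a.keys() & b.keys())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

# Toy log: three trace variants with different frequencies.
log = [("a", "b", "c")] * 6 + [("a", "c", "b")] * 3 + [("a", "d")]
sample = sample_log(log, ratio=0.3)
print(len(sample), df_similarity(log, sample))
```

On this toy log, pure frequency-based ranking keeps only the dominant variant, and the similarity score falls below 1 because the rarer directly-follows relations are missing from the sample. A loss of this kind is what a matrix-based quality indicator is meant to expose without first running a discovery algorithm.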
Keywords/Search Tags: Process mining, Model discovery, Log sampling, Trace clustering, Event log, Quality evaluation