| The digital economy forces enterprises to continuously optimize their business processes to improve efficiency and market competitiveness. At the same time, with the development of informatization and digital transformation, enterprise information systems produce large numbers of event logs during business execution, which contain valuable information about enterprise resources, processes, and process activities. Analysis based on enterprise event logs can identify upcoming business process activities and avoid resource waste caused by information errors or resource conflicts caused by the continuous execution of process instances, so that enterprises can better allocate resources and optimize their business processes. However, traditional feature extraction methods are inefficient for process activity recognition on massive event logs. This paper studies automatic process activity recognition based on deep learning. The main contributions are as follows. First, to address the problem that process discovery applied directly to complex event logs generates "Spaghetti" models and redundant process instances, this paper proposes iBelt, a clustering preprocessing framework oriented toward process activity recognition. The framework defines a "process connection belt" to describe the analysis results of event logs. Based on the idea of clustering trees, a "Clustering through Boosting Decision Tree (CLBDT)" model is designed. Clustering effectively reduces the complexity of event logs and provides more targeted instances for process recognition, and an unsupervised feature selection method based on square difference and discriminant feature analysis is used to mitigate the loss of cluster interpretability caused by high-dimensional data. Experimental results show that the framework improves the clustering performance and process discovery quality of existing methods, and that the clustering results are concise and easy to understand. Second, to address the problem that the traditional
convolutional neural network cannot adequately represent the complex relationships between process activities, this paper proposes iBelt-GCN, a process activity recognition model based on the graph convolutional network (GCN). For feature extraction, the method uses the clusters produced by the iBelt framework as training sets and generates the adjacency matrix from the directly-follows graph of events. The graph structure captures the connections between and execution of the various activity types, and a comprehensive feature vector representation incorporates the temporal features of event traces. Experimental results show that, among graph convolutional networks with four different adjacency matrices, the model using the Laplace transform of the weighted adjacency matrix achieves the best accuracy, although performance is generally weaker than that of the baseline MLP model. Finally, a prototype system for experimentally verifying automatic process activity recognition from event log data is implemented. It provides operation and display interfaces for data preprocessing and feature extraction, model training, and result visualization, including storing and loading data and adjusting related parameters. |
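
As an illustration of the adjacency-matrix step described above, the following minimal sketch builds a weighted adjacency matrix from the directly-follows relation of a toy event log and then normalizes it. It is not the paper's implementation: the function names `directly_follows_matrix` and `normalize_adjacency`, the toy traces, and the activity labels are all hypothetical. Reading the "Laplace transform of the weighted adjacency matrix" as the symmetric normalization with self-loops commonly used in GCNs is likewise an assumption.

```python
import numpy as np


def directly_follows_matrix(traces):
    """Return (activities, A), where A[i, j] counts how often activity j
    directly follows activity i across all traces."""
    activities = sorted({act for trace in traces for act in trace})
    index = {act: i for i, act in enumerate(activities)}
    A = np.zeros((len(activities), len(activities)), dtype=float)
    for trace in traces:
        # Pair every event with its immediate successor in the same trace.
        for src, dst in zip(trace, trace[1:]):
            A[index[src], index[dst]] += 1.0
    return activities, A


def normalize_adjacency(A):
    """Assumed normalization: D^{-1/2} (A + I) D^{-1/2}, with D taken from
    the row sums of A + I (the graph is directed, so this is a simplification)."""
    A_tilde = A + np.eye(A.shape[0])
    degree = A_tilde.sum(axis=1)  # always >= 1 because of the self-loops
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    return A_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]


# Hypothetical toy log: each trace is the ordered activity sequence of one case.
traces = [
    ["register", "check", "approve"],
    ["register", "check", "reject"],
    ["register", "check", "check", "approve"],
]
activities, A = directly_follows_matrix(traces)
A_hat = normalize_adjacency(A)
```

In this sketch `A` plays the role of the weighted adjacency matrix and `A_hat` the normalized variant. An unweighted variant could be obtained with `(A > 0).astype(float)`. Which four matrices the paper actually compares is not stated here.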