| Blockchain technology brings convenience, but it also raises many security problems. Blockchain trading is popular because of its anonymity, yet that same anonymity enables rampant illegal activity such as money laundering. Blockchain transaction data sets contain a large number of noise nodes and an unbalanced proportion of labels. EvolveGCN, the most advanced model for blockchain transaction detection, handles such data sets poorly and has defects such as ignoring graph structure information. This paper therefore proposes a balanced subgraph sampling algorithm for graph sampling and applies it to the RF-EvolveGCN model to solve these problems. The main research work is as follows: 1. A balanced subgraph sampling algorithm for data sets with imbalanced label proportions is proposed. A subgraph sampling algorithm is proposed to reduce the time and space complexity of EvolveGCN when it processes blockchain transaction data sets. On this basis, the balanced subgraph sampling algorithm combines two improved sub-algorithms, a selective noise removal algorithm and a proportional subgraph selection algorithm. The selective noise removal algorithm preserves the important unknown nodes that EvolveGCN's indiscriminate removal discards and assigns them pseudo-labels. The proportional subgraph selection algorithm selects subgraphs by making full use of the graph structure information that EvolveGCN ignores when it selects the training set. Three groups of ablation experiments verify the effectiveness of each sub-algorithm and of the balanced subgraph sampling algorithm as a whole. With the F1 score of illegal nodes as the experimental metric, the balanced subgraph sampling algorithm yields an improvement of 10.2%. Finally, comparative tests select the best training set, in which the ratio of legal to illegal labels is 2:5. 2. An RF-EvolveGCN model is proposed, which can be applied to scenarios with a small number of label 
samples. RF-EvolveGCN addresses the low identification efficiency that EvolveGCN shows on blockchain transaction data sets because of the small number of label samples and the unbalanced proportion of nodes with different labels. The RF-EvolveGCN model consists of two parts. The first is feature importance adjustment with a pre-trained random forest, which ranks the features by importance and reduces dimensionality by removing input features unrelated to the prediction results. The second is random forest classifier replacement, which replaces the logistic regression classifier with a random forest model that classifies better on data sets with few label samples and an unbalanced proportion of nodes with different labels. Three groups of ablation experiments show that the RF-EvolveGCN model improves the illegal-node F1 score by 13.4%. Finally, RF-EvolveGCN is combined with the balanced subgraph sampling algorithm to construct SubRF-EvolveGCN; multi-model comparison experiments show that this model improves the illegal-node F1 score by 16.4%. 3. A blockchain transaction anomaly detection tool is constructed, comprising a data analysis and processing module, a comparison algorithm effect display module, and an anomaly detection node marking module. With the SubRF-EvolveGCN proposed in this paper as the core model, the tool also sets up a data set chart analysis tool and construction tools for the other benchmark models. Finally, the module functions and performance of the tool are tested, and the test results meet expectations. |
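
To make the 2:5 label ratio and the illegal-node F1 metric concrete, the following is a minimal sketch and not the paper's algorithm. It only downsamples labeled nodes to a target legal-to-illegal ratio and computes F1 for the illegal class. It deliberately ignores graph structure and pseudo-labeling, which the proposed method relies on. The function names `sample_to_ratio` and `illegal_f1`, the label strings, and the use of integer node ids are all assumptions made for illustration.

```python
import random


def sample_to_ratio(labels, legal_parts=2, illegal_parts=5, seed=0):
    """Pick labeled node ids so that legal:illegal equals the target ratio.

    labels: dict mapping node id -> "legal", "illegal" or "unknown".
    Unknown nodes are simply skipped in this simplified sketch.
    """
    rng = random.Random(seed)
    legal = sorted(n for n, y in labels.items() if y == "legal")
    illegal = sorted(n for n, y in labels.items() if y == "illegal")
    # Largest whole number of ratio "units" that both classes can supply.
    units = min(len(legal) // legal_parts, len(illegal) // illegal_parts)
    chosen_legal = rng.sample(legal, units * legal_parts)
    chosen_illegal = rng.sample(illegal, units * illegal_parts)
    return chosen_legal, chosen_illegal


def illegal_f1(y_true, y_pred, positive="illegal"):
    """F1 score computed for the illegal class only."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)
```

Reporting F1 on the illegal class alone, rather than overall accuracy, matters here because a classifier that predicts every node as legal can still score high accuracy on an imbalanced data set while detecting nothing.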