| Bitcoin transactions are decentralized and anonymous, which makes them attractive for fraud, money laundering, insider trading, and other illegal activities. If such activity is not detected and stopped in time, it can seriously disrupt the financial order. However, the blockchain technology underlying Bitcoin is decentralized, encrypts its data, and mostly retains only transaction hashes, which makes supervision difficult for financial regulators. Machine learning is a general-purpose data processing technology with good generalization ability and strong self-learning ability. This thesis therefore proposes a Bitcoin abnormal transaction detection model based on ensemble learning that combines a parallel random forest, logistic regression, and k-nearest neighbors (KNN). The main work and contributions of the thesis are as follows: (1) Related work on Bitcoin abnormal transaction detection is summarized, and common detection methods are compared and analyzed. The nature of anomalous transactions and the anomaly detection process are studied, the basic principles of the ensemble model are elaborated, and the method used in this thesis is chosen based on this analysis. (2) The random forest algorithm is comparatively effective for Bitcoin abnormal transaction detection but has low time efficiency, so a parallel random forest algorithm is proposed that shortens training time through multi-process parallel computing. Simulation experiments show that training time decreases gradually as the number of processes increases, and that beyond a certain number of processes the further reduction is no longer significant. When 8 processes are used to train the random forest in parallel, the reduction in training time can reach about 85%. (3) Building on the analysis showing that the parallel random forest improves time
efficiency, and in order to further improve the accuracy of Bitcoin abnormal transaction detection, this thesis proposes a heterogeneous ensemble model based on a parallel random forest, logistic regression, and KNN, and calculates a transaction score and level from the classification results. First, the dataset is randomly partitioned into a training set and a test set in a 7:3 ratio, which improves the randomness of each set and reduces bias. Then an ensemble learning model based on the parallel random forest, logistic regression, and KNN is applied. Simulation results show that the proposed model achieves a precision of 0.9865, a recall of 0.9032, and an F1 score of 0.9430, which effectively improves the accuracy of the model. Finally, to present each transaction more intuitively, a scoring formula is proposed that calculates the security score and transaction level of each Bitcoin transaction from the classification results. |
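
The reported metrics can be cross-checked against each other, since the F1 score is the harmonic mean of precision and recall. The short sketch below recomputes F1 from the precision and recall quoted in the abstract. The function name `f1_from_precision_recall` is an illustrative choice, not something taken from the thesis.

```python
def f1_from_precision_recall(precision: float, recall: float) -> float:
    """Return the F1 score, i.e. the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # convention when both metrics are zero
    return 2 * precision * recall / (precision + recall)


# Values quoted in the abstract.
reported_precision = 0.9865
reported_recall = 0.9032
reported_f1 = 0.9430

recomputed_f1 = f1_from_precision_recall(reported_precision, reported_recall)
print(f"recomputed F1 = {recomputed_f1:.4f}")
```

The recomputed value agrees with the reported F1 to four decimal places, so the three figures are mutually consistent.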
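
The pipeline described in item (3) might look roughly like the following scikit-learn sketch. Several things here are assumptions rather than the thesis's method: the data is synthetic (`make_classification` stands in for a labelled Bitcoin transaction dataset), the three models are combined by soft voting (the abstract does not say how the base learners are combined), the split is stratified, and `n_jobs=8` uses scikit-learn's built-in parallel tree construction as a stand-in for the thesis's own multi-process random forest. The scoring formula is not given in the abstract, so it is not reproduced.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic, imbalanced stand-in for labelled transactions (1 = abnormal).
X, y = make_classification(
    n_samples=2000, n_features=20, n_informative=8,
    weights=[0.9, 0.1], random_state=0,
)

# Random 7:3 train/test split as in the abstract; stratification is an
# added assumption to keep the class ratio similar in both sets.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0,
)

# Heterogeneous ensemble of the three model families named in the abstract.
# Soft voting (averaging predicted probabilities) is an assumption.
ensemble = VotingClassifier(
    estimators=[
        # n_jobs=8 builds trees with up to 8 parallel workers.
        ("rf", RandomForestClassifier(n_estimators=100, n_jobs=8, random_state=0)),
        # Logistic regression and KNN are scale-sensitive, so scale first.
        ("lr", make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))),
        ("knn", make_pipeline(StandardScaler(), KNeighborsClassifier(n_neighbors=5))),
    ],
    voting="soft",
)
ensemble.fit(X_train, y_train)
y_pred = ensemble.predict(X_test)

precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f}")
```

The numbers printed by this sketch come from synthetic data and are not comparable to the results reported in the thesis.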