
A Smart Contract Ponzi Scheme Identification Method Based On Global Word Vector Representation

Posted on: 2023-08-01  Degree: Master  Type: Thesis
Country: China  Candidate: C Q Ma  Full Text: PDF
GTID: 2558306845499364  Subject: Computer Science and Technology
Abstract/Summary:
In recent years, with the development of blockchain technology represented by Bitcoin and Ethereum, research on and application of related technologies have advanced steadily. However, because blockchain technology is decentralized and anonymous, it is difficult for the market to establish an effective supervision mechanism for it. This attracts criminals to transplant Ponzi schemes into Ethereum smart contracts, which brings serious security risks and economic losses to Ethereum and its applications. It is therefore particularly important to identify hidden Ponzi schemes as early as possible by inspecting smart contracts.

Research on identifying Ponzi schemes in Ethereum smart contracts is currently insufficient. Earlier studies defined and identified Ponzi schemes through expert experience and heuristic rules, which was inefficient and produced a high false positive rate. A series of machine learning methods rely heavily on feature extraction, and their recognition accuracy still leaves considerable room for improvement. To overcome the limitations of existing work, it is necessary to mine the public information in smart contracts to further improve feature extraction and detection performance.

This thesis focuses on the feature representation of smart contracts. It works mainly with the contract code that is publicly available on the network and proposes a smart contract Ponzi scheme identification method based on global vectors for word representation (GloVe). The method treats the opcodes of a smart contract as words in natural language text, so that the problem of vectorizing code sequences is transformed into a text vectorization problem. By introducing weight coefficients into GloVe, a technique from natural language processing, a feature matrix carrying semantic information is constructed from the contract code. A variety of machine learning classification algorithms are then trained on these features to detect smart contract Ponzi schemes.

To address the class imbalance in the smart contract Ponzi scheme dataset and the prediction shift problem in the gradient boosting algorithm, this thesis uses the focal loss function and the ordered boosting algorithm to improve the classification model. Experiments show that the improved model achieves higher F1 and AUC values than the unimproved model and performs better than other imbalanced-data classification methods. Finally, the model is tested in the real setting of the Ethereum network. A previously undiscovered Ponzi scheme contract was successfully identified, which further demonstrates the effectiveness of the proposed algorithm.
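The abstract does not give formulas, so the following is only a minimal sketch of three ingredients it names: counting opcode co-occurrences as if opcodes were words, a GloVe-style weighting of those counts, and the focal loss for imbalanced binary labels. The function names, the window size, and the defaults (`x_max=100`, `alpha=0.75`, `gamma=2.0`, `alpha=0.25`) are assumptions taken from the commonly published forms of GloVe and focal loss, not from the thesis; in particular, the thesis's own weight coefficients may differ from the standard GloVe weighting shown here.

```python
import math
from collections import Counter


def cooccurrence(opcodes, window=2):
    """Distance-weighted, symmetric co-occurrence counts over an opcode sequence.

    Each pair of opcodes at distance d <= window contributes 1/d in both
    directions (an assumed convention, as in the usual GloVe preprocessing).
    """
    counts = Counter()
    for i, center in enumerate(opcodes):
        for j in range(max(0, i - window), i):
            weight = 1.0 / (i - j)
            counts[(center, opcodes[j])] += weight
            counts[(opcodes[j], center)] += weight
    return counts


def glove_weight(x, x_max=100.0, alpha=0.75):
    """Standard GloVe weighting f(x): damp rare pairs, cap frequent ones at 1."""
    return (x / x_max) ** alpha if x < x_max else 1.0


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one sample.

    p is the predicted probability of the positive (Ponzi) class, y is 0 or 1.
    The factor (1 - p_t) ** gamma shrinks the loss of well-classified samples.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


# Hypothetical opcode sequence, for illustration only.
sequence = ["PUSH1", "PUSH1", "MSTORE", "CALLVALUE"]
pair_counts = cooccurrence(sequence, window=2)
```

In this sketch a confidently correct prediction contributes far less loss than a confidently wrong one, which is the property that makes focal loss a candidate for datasets where Ponzi contracts are a small minority.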
Keywords/Search Tags:Ethereum, Smart Contracts, Ponzi Schemes, Code Feature Representation, Cost-Sensitive Loss