Research on Bytecode Vulnerability Detection Technology for Smart Contracts Based on Code Similarity

Posted on: 2023-07-22
Degree: Master
Type: Thesis
Country: China
Candidate: D Zhu
GTID: 2558306623497164
Subject: Engineering
Abstract/Summary:
In recent years, the number of smart contracts on the blockchain has increased explosively, partly because code reuse improves development efficiency. However, code reuse also accelerates the spread of vulnerabilities and causes great harm to the blockchain. Code similarity detection makes it possible to locate and track smart contract vulnerabilities quickly. According to statistics, the vast majority of smart contracts publish only their bytecode. Applying code similarity to the vulnerability detection of smart contract bytecode is therefore of great significance.

Bytecode similarity detection for smart contracts faces two difficulties. The first is the bytecode diversity introduced by optimization options. Executing a smart contract incurs a cost called gas, and when bytecode is generated, the compiler's optimization options rewrite the contract to reduce the gas required for execution. The second is the bytecode diversity introduced by different compiler versions. The compiler for Solidity, the smart contract development language, iterates very quickly and has reached hundreds of versions. To address these difficulties, this thesis proposes a set of solutions. The main contents and research results are as follows:

(1) A triplet-network-based method for embedding smart contract basic blocks across optimization options is proposed. Basic blocks compiled from the same source code with and without optimization are treated as basic block pairs. After logical opcodes are extracted and opcode instructions are normalized, the basic blocks are matched into pairs, and the resulting pairs are fed into a pre-training model. On this basis, the negative sampling method is improved to obtain negative samples, and the resulting triplet samples are used to train the triplet network. The final basic block embedding accuracy is 97.8%. The optimized and unoptimized basic blocks of the same source code are thereby brought closer together in vector space, which improves the accuracy of bytecode similarity detection across optimization options.

(2) A bytecode feature extraction method based on simulated execution over the control flow graph (CFG) is proposed. Common program semantic patterns are first adapted to smart contract opcodes. Key instruction combination features are then obtained by simulating the storage structures of the Ethereum Virtual Machine and executing instructions sequentially along the CFG recovered by decompiling the bytecode. This copes with the changing runtime environment during bytecode execution and improves the accuracy of similarity detection.

(3) A graph embedding neural network model based on the attributed control flow graph (ACFG) is proposed. Basic block features, basic block sequence features, and key instruction combination features are fused to improve bytecode matching accuracy. The ACFG carrying these three kinds of features is then embedded into vector space using structure2vec graph embedding, and the best embedding is obtained by training a Siamese network to measure bytecode similarity.

Comparative experiments with other methods show that the proposed approach improves the similarity measurement of smart contract bytecode. Combined with the constructed vulnerability library, the bytecode similarity measurement method enables homologous vulnerability detection for smart contract bytecode. Experimental analysis shows that, compared with existing work, accuracy is improved by 3.37% and detection time is greatly reduced, which verifies the effectiveness of the method.
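As a rough illustration of contribution (1), the sketch below shows one way opcode normalization and a triplet objective could fit together. It is not the thesis's implementation: the abstract does not state the normalization rules, the distance metric, or the loss. The rule that collapses `PUSHn`/`DUPn`/`SWAPn`/`LOGn` into family tokens, the Euclidean distance, the standard triplet margin loss, and the names `normalize_opcodes` and `triplet_loss` are all assumptions made for this example.

```python
import math
import re

# Assumed normalization rule (not from the thesis): collapse width/index
# variants of an opcode family into one token, e.g. PUSH1 and PUSH32 -> PUSH.
_FAMILY = re.compile(r"^(PUSH|DUP|SWAP|LOG)\d+$")


def normalize_opcodes(opcodes):
    """Return the opcode sequence with family variants collapsed."""
    normalized = []
    for op in opcodes:
        match = _FAMILY.match(op)
        normalized.append(match.group(1) if match else op)
    return normalized


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: max(0, d(a, p) - d(a, n) + margin).

    anchor/positive stand for embeddings of the optimized and unoptimized
    basic block of the same source; negative is an unrelated block.
    """
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)


# Toy usage with made-up opcode sequences and 2-D embeddings.
block = ["PUSH1", "PUSH2", "DUP3", "SWAP1", "ADD", "SSTORE"]
print(normalize_opcodes(block))

# The positive is already closer than the negative by more than the margin,
# so the loss is zero.
print(triplet_loss([0.0, 0.0], [0.0, 1.0], [3.0, 4.0]))
```

The loss is zero once the positive sits closer to the anchor than the negative by at least the margin; this is the sense in which training pulls optimized and unoptimized blocks of the same source together in vector space.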
Keywords: Smart contract, Code similarity, Triplet network, Graph embedding, Vulnerability detection