Research On Instruction-words Based Software Birthmarking

Posted on: 2011-03-03
Degree: Master
Type: Thesis
Country: China
Candidate: L Chen
Full Text: PDF
GTID: 2178330332978406
Subject: Computer application technology
Abstract/Summary:
This thesis mainly discusses software birthmarking for Java programs. After a brief introduction to the state of software birthmarking and the classic k-gram birthmark, it proposes three kinds of software birthmark and a new method for evaluating the overall performance of a software birthmark. The contents of the thesis are as follows:
(1) Some previous work uses the F-measure curve to evaluate the overall performance of a software birthmark. That approach has problems, however, such as singularities and a limit on the size of the sample set. The union of the robustness and credibility curves is therefore proposed as a replacement for the F-measure curve. The experimental results show that this method evaluates the overall performance of a software birthmark better.
(2) The robustness of the classic k-gram birthmark is not ideal, and the frequency of k-gram fragments influences the comparison between programs. A frequency-of-k-gram birthmark, which consists of the k-gram fragments together with their frequencies, is therefore proposed. In experiments comparing Java class files, the frequency-of-k-gram birthmark outperforms the classic k-gram birthmark in credibility, robustness and overall performance.
(3) The k-gram algorithm cuts the instruction sequence mechanically, so the resulting fragments cannot reflect semantics. Drawing on word segmentation as used in copy detection for natural-language documents, this thesis introduces the concept of the instruction-word. Treating an "instruction-word" as a stable instruction sequence, a software birthmark based on frequency-statistic instruction-words is proposed. The experimental results show that this birthmark has good credibility but only average robustness.
(4) The algorithm that builds the frequency-statistic instruction-word library requires an empirically chosen parameter, and the forward maximum matching method is easily affected by semantics-preserving transformations. To address these problems, a software birthmark based on conditional-probability-analysis instruction-words is proposed, which takes into account the internal relations within an instruction sequence. In comparisons between Java class files, the proposed birthmark performs as well as the frequency-of-k-gram birthmark and better than both the classic k-gram birthmark and the frequency-statistic instruction-word birthmark. In comparisons between Java packages, the proposed birthmark performs better than all the others.
Finally, conclusions are drawn and directions for further research are put forward.
Keywords/Search Tags:software theft detection, software birthmarking, instruction-word, conditional probability, k-gram algorithm, semantics-preserving transformations
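
To make the difference between the classic k-gram birthmark and the frequency-of-k-gram birthmark in item (2) concrete, the following is a minimal Python sketch, not the thesis's implementation. It assumes a program is already available as a list of opcode names. The function names, the sample opcode sequences, and both similarity measures (Jaccard for the set form, cosine for the frequency form) are assumptions chosen for illustration; the abstract does not say which measures the thesis uses.

```python
from collections import Counter
from math import sqrt


def kgrams(opcodes, k):
    """Return every contiguous window of k opcodes, in order of appearance."""
    return [tuple(opcodes[i:i + k]) for i in range(len(opcodes) - k + 1)]


def kgram_birthmark(opcodes, k):
    """Classic form: the set of distinct k-grams (frequencies are discarded)."""
    return set(kgrams(opcodes, k))


def freq_kgram_birthmark(opcodes, k):
    """Frequency form: each distinct k-gram mapped to its occurrence count."""
    return Counter(kgrams(opcodes, k))


def set_similarity(a, b):
    """Jaccard similarity of two k-gram sets (an assumed measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cosine_similarity(a, b):
    """Cosine similarity of two k-gram frequency maps (an assumed measure)."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical opcode sequences: `transformed` is `original` with one
# inserted nop, standing in for a semantics-preserving transformation.
original = ["iload", "iload", "iadd", "istore",
            "iload", "iload", "iadd", "istore"]
transformed = ["iload", "iload", "nop", "iadd", "istore",
               "iload", "iload", "iadd", "istore"]

k = 2
print(set_similarity(kgram_birthmark(original, k),
                     kgram_birthmark(transformed, k)))
print(cosine_similarity(freq_kgram_birthmark(original, k),
                        freq_kgram_birthmark(transformed, k)))
```

The set form treats a fragment that occurs once the same as one that occurs many times, while the frequency form weights each fragment by its count. That is the extra information item (2) adds to the birthmark.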