Software plagiarism detection using abstract syntax tree and graph-based data mining |
| Posted on:2006-01-15 | Degree:M.S | Type:Thesis |
| University:Oklahoma State University | Candidate:Hsiao, Hsi-Yue Sean | Full Text:PDF |
| GTID:2458390008474803 | Subject:Computer Science |
| Abstract/Summary: |
| Scope and method of study. This study uses a graph-based data mining technique to discover cases of software plagiarism. We hypothesize that repetitive patterns found in the abstract syntax tree (AST) representation of one program will match the patterns of another program only if both were written by the same author. A graph-based data mining technique was used to analyze the ASTs and extract the patterns. The patterns returned by the data miner were then compared using a graph matching algorithm, which provided the measure of similarity. We used artificial test sets and actual student assignments for evaluation. Findings and conclusions. The experiments identified plagiarism behaviors in both artificial and real-world data. These results indicate that the approach is feasible. The system can be applied to any programming language whose compiler builds an abstract syntax tree, since the AST can readily be extracted from the compiler. (Abstract shortened by UMI.)... |
| Keywords/Search Tags: | Using, Abstract syntax, Graph-based data, Plagiarism |
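The abstract describes extracting structural patterns from ASTs and comparing them to obtain a similarity measure. The sketch below is only a rough illustration of that general idea, not the method used in the thesis: it swaps the graph-based data miner and graph matching algorithm for a much simpler stand-in, counting parent-to-child node-type edges and comparing the counts with a multiset Jaccard score. It assumes Python source and the standard `ast` module; the function names `edge_profile` and `similarity` and the sample programs are hypothetical.

```python
import ast
from collections import Counter


def edge_profile(source: str) -> Counter:
    """Count (parent type, child type) edges in the AST of `source`.

    Assumption: this crude edge count stands in for the mined graph
    patterns described in the abstract; it is not the thesis's miner.
    """
    tree = ast.parse(source)
    edges = Counter()
    for parent in ast.walk(tree):
        for child in ast.iter_child_nodes(parent):
            edges[(type(parent).__name__, type(child).__name__)] += 1
    return edges


def similarity(a: str, b: str) -> float:
    """Multiset Jaccard similarity of two edge profiles, in [0, 1]."""
    pa, pb = edge_profile(a), edge_profile(b)
    shared = sum((pa & pb).values())  # element-wise minimum of counts
    total = sum((pa | pb).values())   # element-wise maximum of counts
    return shared / total if total else 1.0


# Hypothetical samples: `renamed` only changes identifiers in `original`.
original = "def total(xs):\n    s = 0\n    for x in xs:\n        s += x\n    return s\n"
renamed = "def add_all(items):\n    acc = 0\n    for it in items:\n        acc += it\n    return acc\n"
different = "def greet(name):\n    print('hello ' + name)\n"

print(similarity(original, renamed))
print(similarity(original, different))
```

Because node types ignore identifier names, renaming variables leaves the edge profile unchanged, which is the kind of disguise an AST-based comparison is meant to see through. A real system along the lines of the abstract would mine larger recurring subgraphs rather than single edges.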