Research On The Application Of The Document Copy Detection

Posted on: 2011-04-12
Degree: Master
Type: Thesis
Country: China
Candidate: S M Liu
Full Text: PDF
GTID: 2178360305971650
Subject: Computer application technology
Abstract/Summary:
With the rapid development of Web technology and digital libraries, the network has become an important source of information for most people. Networked information lets scientists communicate their work efficiently, but it also creates opportunities for plagiarism and the illegal use of information. Copy detection technology was developed to prevent the illegal copying and distribution of digital documents, and it is used in intellectual property protection and information retrieval. Copy detection judges whether a given document plagiarizes the content of other documents in a database. Plagiarism takes several forms, such as duplicating part or all of a document, or using different words and sentences to express the same meaning as the text of previous documents in the database. Copy detection falls into two categories: plagiarism of source code and plagiarism of natural language.
To detect copying, we first extract characteristic values from each program or document, which represent the basic language units of its content. We then compare these values and use the result to calculate the degree of similarity between the programs or documents. From the similarity result, we can determine whether one plagiarizes the other.
Firstly, this paper introduces the background, basic concepts, research status and scientific significance of copy detection. It then analyses the key points of current systems, such as system structure and the representation of text in the computer, and explores the related technologies and their characteristics in order to build a system.
Secondly, the paper designs the architecture of the copy detection system. It is based on B/S (browser/server), a multi-layer structure, with ASP.NET and database technology. In the implementation, the system uses ADO.NET within ASP.NET to connect the web pages to the database, uses field matching to implement log-in, and checks uploaded documents with string-matching methods.
Thirdly, it describes the implementation of each function module of the system in detail. Finally, experiments demonstrate the practicality and validity of the system. The paper summarizes the characteristics and shortcomings of the system and proposes directions for future development and application.
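The extract, compare, and score steps described in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation (which is built on ASP.NET and string matching): the choice of overlapping word triples ("shingles") as characteristic values, the Jaccard measure, the 0.5 threshold, and the function names shingles, similarity and find_suspected_copies are all assumptions made here for illustration.

```python
import re


def shingles(text, k=3):
    """Return the set of overlapping k-word sequences found in text.

    Assumption: word k-grams stand in for the 'characteristic values'
    mentioned in the abstract; the thesis may use a different unit.
    """
    words = re.findall(r"\w+", text.lower())
    if len(words) < k:
        # Very short texts yield a single shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + k]) for i in range(len(words) - k + 1)}


def similarity(doc_a, doc_b, k=3):
    """Jaccard similarity of the two shingle sets, a value in [0, 1]."""
    a, b = shingles(doc_a, k), shingles(doc_b, k)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def find_suspected_copies(doc, corpus, threshold=0.5, k=3):
    """Return (index, score) for each corpus document scoring >= threshold.

    The threshold is a hypothetical cut-off, not a value from the thesis.
    """
    results = []
    for index, other in enumerate(corpus):
        score = similarity(doc, other, k)
        if score >= threshold:
            results.append((index, score))
    return results
```

A submitted document would be passed to find_suspected_copies together with the stored documents; any returned pair marks a stored document whose overlap with the submission meets the chosen threshold. Exact word-sequence overlap of this kind catches verbatim copying but not paraphrase, which the abstract lists as a separate form of plagiarism.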
Keywords/Search Tags:copy detection, similarity, source code, natural language