Research And Implementation Of Homework Duplication Checking System Based On Hybrid Detection Strategy

Posted on: 2023-10-25
Degree: Master
Type: Thesis
Country: China
Candidate: B H Hu
GTID: 2557306830452734
Subject: Computer technology
Abstract/Summary:
Many institutions and researchers work on tools for detecting plagiarism in academic content. The best-known academic plagiarism detection systems include the English duplicate checking system Turnitin and the Chinese academic misconduct paper detection system. These systems identify suspected plagiarism by comparing the documents submitted by users against the system's document library. Homework plagiarism in college classrooms is one form of academic plagiarism. Plagiarized homework is drawn mainly from other students' homework and from the Internet, and in a few cases from research papers, so studying duplication checking techniques for homework plagiarism among students in a class is of practical significance. To meet the actual demand for homework duplication checking in college computer course teaching, this thesis surveys existing commercial and academic work on text plagiarism detection, analyzes the characteristics of homework in college computer courses, and designs and implements a homework duplication checking system based on a hybrid detection strategy. The main work and contributions of this thesis are as follows:
(1) This thesis analyzes the teaching needs and homework characteristics of college computer courses. According to its content, homework in these courses is divided into two types: experimental report homework, and subjective and objective question homework. A hybrid detection strategy is proposed, and a dedicated duplication detection algorithm is developed for each of the two types.
(2) For duplication checking of experimental report homework in college computer courses, a document similarity calculation algorithm based on mixed pattern feature matching is proposed. Fingerprint features are extracted from the text content of each experimental report and perceptual hash features from its picture content; the text similarity and the picture similarity between two reports are calculated separately, and the similarity of the reports is obtained as a weighted combination of the two. To validate the method, this thesis surveys the text length characteristics of computer course experimental reports at a university, uses a corpus generation method for text plagiarism detection evaluation to simulate the real teaching scenario, and uses the generated evaluation corpus to validate the proposed text similarity calculation method.
(3) For duplication checking of subjective and objective question homework in college computer courses, a double-layer matching model based on grammatical features and semantic features is proposed. A fast sentence matching method based on grammatical feature similarity first finds the set of sentence pairs that may be similar between two answers to a subjective question. Deep text features extracted with BERT and a feature matching model based on Bi-LSTM are then used to match these candidate sentence pairs, and the text similarity is calculated by counting the number of semantically similar sentences in the text. To validate the proposed method, this thesis uses the Chinese LCQMC data set, the English PAN-2011 interpretation data set, and the English MSRP data set.
(4) A homework duplication checking system is designed and implemented. The system incorporates the algorithm models proposed in this thesis and is integrated with the Sunflower online learning platform developed by the Guangdong Key Laboratory of Computer Network to provide teachers with an online homework duplication checking function. In a practical application scenario, the system was tested on two courses (Algorithm Design and Analysis, and High-Performance Computing) in the school of computer science and engineering of a university. Among a total of 434 assignments, 5 pairs of completely copied assignments and 6 pairs of suspected copied assignments were found.
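The weighted report similarity described in item (2) can be illustrated with a minimal sketch. The abstract does not specify the fingerprinting scheme, the perceptual hash variant, or the weights, so everything concrete here is an assumption made for illustration: hashed character k-grams compared with Jaccard similarity stand in for the text fingerprints, a simple average hash over an 8x8 grayscale grid stands in for the perceptual hash, and the function names and the text weight of 0.7 are hypothetical.

```python
from __future__ import annotations

import hashlib


def _hash64(s: str) -> int:
    """Map a string to a 64-bit integer (first 8 bytes of its MD5 digest)."""
    return int.from_bytes(hashlib.md5(s.encode("utf-8")).digest()[:8], "big")


def text_fingerprints(text: str, k: int = 5) -> set[int]:
    """Assumed fingerprint scheme: hash every character k-gram of the text."""
    chars = "".join(text.split()).lower()  # drop whitespace, ignore case
    if not chars:
        return set()
    if len(chars) <= k:
        return {_hash64(chars)}
    return {_hash64(chars[i:i + k]) for i in range(len(chars) - k + 1)}


def jaccard(a: set[int], b: set[int]) -> float:
    """Jaccard similarity of two fingerprint sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def average_hash(pixels: list[list[int]]) -> int:
    """Assumed perceptual hash: one bit per pixel, set if brighter than the mean.

    `pixels` is a grayscale grid already downscaled to 8x8 by the caller.
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hash_similarity(h1: int, h2: int, n_bits: int = 64) -> float:
    """1 minus the normalized Hamming distance between two hashes."""
    return 1.0 - bin(h1 ^ h2).count("1") / n_bits


def image_similarity(hashes_a: list[int], hashes_b: list[int]) -> float:
    """Average best-match similarity, taken in both directions."""
    if not hashes_a or not hashes_b:
        return 0.0
    best_a = [max(hash_similarity(a, b) for b in hashes_b) for a in hashes_a]
    best_b = [max(hash_similarity(a, b) for a in hashes_a) for b in hashes_b]
    scores = best_a + best_b
    return sum(scores) / len(scores)


def report_similarity(text_a: str, text_b: str,
                      hashes_a: list[int], hashes_b: list[int],
                      w_text: float = 0.7) -> float:
    """Weighted combination of text and image similarity (weight is assumed)."""
    t = jaccard(text_fingerprints(text_a), text_fingerprints(text_b))
    if not hashes_a and not hashes_b:
        return t  # neither report contains pictures: use text similarity only
    return w_text * t + (1.0 - w_text) * image_similarity(hashes_a, hashes_b)
```

In this sketch a caller would compute `average_hash` once per picture in each report and pass the resulting lists to `report_similarity`; pairs of reports whose score exceeds a chosen threshold would then be flagged for review. The threshold, like the weight, would need to be tuned on an evaluation corpus.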
Keywords/Search Tags: Plagiarism detection, Pattern matching, Natural language processing, Text similarity calculation