Research And Design Of The English Essay Similarity Detection System For Chinese College Students

Posted on:2018-04-09

Degree:Master

Type:Thesis

Country:China

Candidate:X Y Wang

GTID:2335330515996089

Subject:Software engineering

Abstract/Summary:

With the development of natural language processing technology, more and more colleges seek to use technology to improve their teaching efficiency. Under these conditions, automatic grading technology for English compositions has emerged. In China, a number of automatic grading systems exist, but the similarity detection algorithms in these systems are superficial. Abroad, research on similarity detection mainly focuses on long texts such as papers and source code. Therefore, the goal of this thesis is to improve the similarity detection algorithm and develop a similarity detection system better suited to colleges. To achieve this goal, the thesis first studies the characteristics of English compositions. Second, according to these characteristics, English compositions are divided into different types. For long compositions of 60 words or more, the thesis designs a similarity detection algorithm based on WordNet semantic clustering, improving the TCUSS clustering algorithm. For short compositions of fewer than 60 words, the thesis designs a similarity detection algorithm based on stop words, after verifying the stability of stop words in English. Third, the thesis collects a number of corpus samples and uses them to verify the results of the two algorithms and of the overall similarity detection algorithm. Comparing these results with those of the K-means algorithm leads to the conclusion that the new algorithm is superior to K-means. Finally, based on the new algorithm, the thesis designs the similarity detection subsystem of the computer-aided review system. In summary, the thesis presents a similarity detection algorithm for English compositions. After verification, the accuracy, recall, and F1 score of the overall algorithm are better than those of the commonly used similarity detection algorithm. Furthermore, the subsystem is designed asynchronously so that the computer-aided review system can meet the demands of large-scale use.
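
The length-based split and the stop-word idea described in the abstract can be illustrated with a minimal Python sketch. The abstract only states the 60-word threshold and that short compositions are compared using stop words; everything else here is an assumption made for illustration, not the thesis's actual method: the tiny STOP_WORDS list, the regular-expression tokenizer, the function names, and the choice of cosine similarity over stop-word frequency vectors. The WordNet clustering branch for long compositions is not implemented and is only named by the dispatcher.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately small stop-word list (the abstract does not give one).
STOP_WORDS = ("the", "a", "an", "of", "to", "in", "and", "is",
              "it", "that", "for", "on", "with", "as", "was")

# Threshold stated in the abstract: 60 words or more counts as a long composition.
LONG_ESSAY_MIN_WORDS = 60


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens (assumed tokenizer)."""
    return re.findall(r"[a-z']+", text.lower())


def choose_detector(text):
    """Return the name of the branch a composition would be routed to."""
    if len(tokenize(text)) >= LONG_ESSAY_MIN_WORDS:
        return "wordnet_clustering"  # long-composition branch, not implemented here
    return "stop_words"              # short-composition branch, sketched below


def stop_word_profile(text):
    """Count how often each stop word occurs, in the fixed order of STOP_WORDS."""
    counts = Counter(t for t in tokenize(text) if t in STOP_WORDS)
    return [counts[w] for w in STOP_WORDS]


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def short_essay_similarity(a, b):
    """Assumed similarity measure: cosine of the two stop-word frequency profiles."""
    return cosine(stop_word_profile(a), stop_word_profile(b))


if __name__ == "__main__":
    first = "The cat is on the mat and it is happy."
    second = "The dog is in the yard and it is loud."
    print(choose_detector(first))
    print(round(short_essay_similarity(first, second), 3))
```

Comparing stop-word profiles rather than content words reflects the abstract's remark that stop-word usage in English is stable; how the thesis actually turns that observation into a score is not stated, so the cosine measure above should be read as one plausible choice.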

Keywords/Search Tags:

Composition Scoring, Similarity Detection, Stop Words, Text Clustering, Semantic Information

Related items

1	Research And Design Of The English Essay Off-topic Detection System For Chinese College Students
2	The Implementation Of Plagiarism Detection Model For English Essay
3	Textual Similarity Detection
4	A Study On The Automatic Scoring Method For Chinese-English Interpreting Questions In Terms Of The Key Information Of Answers
5	Research On Automatic Scoring Model Of Chinese-English Interpretation Based On Semantic Scoring
6	An Experimental Study Of Effects Of Word Clustering On Senior High School English Vocabulary Learning
7	The Effects Of Word Clustering Presentation Modes On Vocabulary Retention Of Senior High School Students
8	The Effects Of Semantic Clustering & Thematic Clustering On Multi-dimensional Vocabulary Knowledge Acquisition Of Senior Middle School Students
9	Automatic Identification And Extraction Of English Verb Patterns: A Study Based On The Clustering Of Concordances
10	Semantic edge detection: Intra- and inter-hemispheric processing of semantically ambiguous words