Automatic Scoring Algorithms For English Translations Based On Semantic Understanding

Posted on: 2023-06-28
Degree: Master
Type: Thesis
Country: China
Candidate: G Li
Full Text: PDF
GTID: 2555307103985209
Subject: Computer Science and Technology
Abstract/Summary:
Translation is an important question type in many English exams. Grading it manually requires substantial human resources, and factors such as the scorer's fatigue and mood affect the results. Automatic scoring of English translations can reduce the differences caused by scorers' subjective factors, improve the fairness of scores, and lighten the scorers' workload, so it has high practical value. This paper therefore studies automatic scoring algorithms for English translation.

Current research on automatic English translation scoring falls into three categories. The first is based on vocabulary matching, which is limited with respect to word meaning, structure, and knowledge. The second is the shallow semantic method, which scores with shallow semantic techniques such as synonyms, but it cannot completely replace the lexical method and does not consider the deep semantics of sentences. The third is the deep semantic method, but the semantic information it extracts is not rich enough, and misspelled words interfere with the model's extraction of semantics. Building on this research, this paper follows the scoring criteria given by the Ministry of Education for the CET-4 and CET-6 translation questions and scores translations automatically from two aspects: vocabulary and semantics.

In terms of vocabulary, existing error correction methods struggle to be both comprehensive and unambiguous, and they also suffer from low search efficiency and a lack of error correction at the semantic level. This paper proposes a semantics-based English spelling error correction method. Edit distance is used to compute the set of candidate corrections for a misspelled word. WordNet is used to expand the words in the reference translation with their synonyms, and the expanded word set serves as the filter for the edit-distance candidates, removing irrelevant and incorrect words. A bigram language model then calculates the probability that each word in the filtered set is the correct word, and the word with the maximum probability is selected as the final correction. A smaller and faster storage method replaces the traditional vocabulary storage method. Experiments show that, compared with traditional error correction methods, semantics-based English spelling correction performs well and also reduces the consumption of physical resources to a certain extent.

In terms of semantics, existing models tend to lose semantic focus and ignore sentence structure information. To address these problems, a semantic computing model that integrates multi-angle features is proposed. The model is based on the Siamese network. It uses word vectors generated by the BERT model, fused with word similarity, to strengthen semantic features, and it uses Bi-LSTM to encode the sentence structure features of the input text, that is, the structural information in the text's part-of-speech sequence. A Transformer encoder performs multi-level interaction between the sentence structure features and the text features, and the semantic similarity between the texts is finally inferred from the concatenated vectors. Experiments on part of the Quora dataset show that this model outperforms the classic models.

A complete automatic scoring method for English translation is built from the above algorithms, and it performs well compared with existing English translation scoring models. Finally, the shortcomings of this paper and directions for future optimization are summarized.
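The spelling-correction steps described in the abstract (edit-distance candidates, filtering by a synonym-expanded reference vocabulary, and bigram selection) can be illustrated with a minimal sketch. This is not the thesis's implementation. The function names, the distance threshold of 2, the add-one smoothing, the toy corpus, and the hand-written `SYNONYMS` table (a stand-in for a real WordNet lookup) are all assumptions made for illustration.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def bigram_counts(sentences):
    """Count unigrams and bigrams over whitespace-tokenized sentences."""
    unigrams, bigrams = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split()
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def correct(word, prev_word, candidates, unigrams, bigrams, max_dist=2):
    """Pick the candidate within max_dist edits that best follows prev_word."""
    near = [c for c in candidates if edit_distance(word, c) <= max_dist]
    if not near:
        return word  # nothing plausible: leave the word unchanged
    vocab_size = len(unigrams)

    def score(c):
        # Add-one smoothed bigram probability P(c | prev_word) (assumed smoothing).
        return (bigrams[(prev_word, c)] + 1) / (unigrams[prev_word] + vocab_size)

    # Ties on probability are broken in favor of the smaller edit distance.
    return max(near, key=lambda c: (score(c), -edit_distance(word, c)))


# Hypothetical synonym table standing in for a WordNet lookup.
SYNONYMS = {"beautiful": ["lovely", "pretty"], "weather": ["climate"]}

reference = "the weather is beautiful today"
expanded = set(reference.split())
for ref_word in reference.split():
    expanded.update(SYNONYMS.get(ref_word, []))

# Toy corpus used only to estimate bigram counts.
corpus = [reference, "the weather is lovely", "the climate is pretty mild"]
unigrams, bigrams = bigram_counts(corpus)

print(correct("beautifull", "is", expanded, unigrams, bigrams))  # -> beautiful
```

Restricting candidates to the expanded reference vocabulary keeps the search small and biases corrections toward words that are semantically relevant to the reference translation, which is the motivation the abstract gives for the filtering step.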
Keywords/Search Tags: English translation, Automatic scoring, English error correction, Transformer, Semantic similarity