| With the application of artificial intelligence and Internet technology, online education has developed rapidly. At present, online examination systems can accurately and automatically score objective questions such as multiple-choice and fill-in-the-blank questions, but subjective questions still have to be scored manually by teachers. As the scale of online examinations continues to expand, the marking burden on teachers keeps growing, which hinders the development of online education and teaching. It is therefore of great significance to study techniques for the automatic scoring of subjective questions. Text similarity calculation is a key technology in this task. This paper studies and improves several similarity algorithms suitable for the automatic scoring of subjective questions and, through comparative experiments, selects the algorithm with the highest accuracy and applies it in an automatic scoring system for subjective questions. The data set used in the experiments comes from the short-answer questions of the final exam of the software engineering course at the Southwest University Network Education College. The main research contents of this paper are as follows: 1. To address the shortcomings of representing a text vector by its averaged word vectors (Word2Vec Average, WA), the WA-PL algorithm is obtained by adding the part-of-speech (P) feature and the text-length (L) feature to the WA method. Experimental results show that the improved WA-PL algorithm has a lower mean absolute error than WA. 2. To capture text semantics more efficiently, TF-IDF or TextRank keyword extraction is added to the WA-PL algorithm, giving the TF-WA-PL and TR-WA-PL algorithms. Experiments show that both TF-WA-PL and TR-WA-PL are more accurate than WA-PL. Among them, TR-WA-PL has a higher 
accuracy rate than TF-WA-PL, which shows that keywords extracted by TextRank represent text semantics better than those extracted by TF-IDF. 3. This paper also studies the SIF (Smooth Inverse Frequency) algorithm, which weights the word vectors of a text by their smooth inverse frequency, averages them, and then subtracts the projection onto the first principal component to obtain the text vector. Here, cosine similarity is computed between the student-answer text vector and the standard-answer text vector obtained by the SIF method, and the result is combined with a sentence-length similarity weighting to obtain the SIF-WL (SIF-Word2Vec Length) algorithm. Experiments show that this algorithm has the smallest mean absolute error and the highest accuracy among the algorithms tested, and that its scores are the closest to manual scores, which indicates that its text vectors express text semantics more accurately. The SIF-WL algorithm is therefore chosen for the automatic scoring system for subjective questions. In addition to the above research, this paper develops an automatic scoring system for subjective questions based on this algorithm. The system consists mainly of a registration and login module, a teacher module, and a student module. The teacher functions mainly include question bank management, test paper management, exam management, test paper scoring, and class management. The student functions mainly include taking exams, checking exam scores, and checking the answers to exam questions. After the students complete an exam, the teacher uses the automatic grading function to grade all the questions. The scoring method combines the SIF-WL algorithm with a plagiarism detection algorithm: if a student's answer is found to be plagiarized, a certain number of points is deducted from the SIF-WL score. If an automatic score is not appropriate, the teacher can correct 
the scoring results. Finally, system testing shows that the system scores answers well on the specified subjects, which provides a useful reference for future related research. |
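
The SIF-based similarity described in item 3 can be sketched roughly as follows. This is only an illustrative sketch under stated assumptions, not the system's actual implementation. The function names, the smoothing constant `a`, the length-similarity formula (shorter length divided by longer length), the mixing weight `alpha`, and the random toy word vectors standing in for trained Word2Vec vectors are all hypothetical. The abstract does not specify how cosine similarity and length similarity are combined, so a simple convex combination is assumed here.

```python
import numpy as np


def sif_average(tokens, vectors, word_prob, a=1e-3):
    """Weight each known word vector by a / (a + p(w)), then average."""
    known = [t for t in tokens if t in vectors]
    dim = len(next(iter(vectors.values())))
    if not known:
        return np.zeros(dim)
    weights = np.array([a / (a + word_prob.get(t, 0.0)) for t in known])
    return weights @ np.stack([vectors[t] for t in known]) / len(known)


def remove_first_component(emb):
    """Subtract each row's projection onto the first singular direction.

    The first right singular vector of the (uncentered) matrix is used as
    the first principal component; skipping centering is an assumption.
    """
    _, _, vt = np.linalg.svd(emb, full_matrices=False)
    pc = vt[0]
    return emb - np.outer(emb @ pc, pc)


def cosine_similarity(u, v):
    """Cosine of the angle between u and v; 0.0 if either vector is zero."""
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom > 0 else 0.0


def length_similarity(a_tokens, b_tokens):
    """Hypothetical length similarity: shorter length over longer length."""
    longest = max(len(a_tokens), len(b_tokens))
    return min(len(a_tokens), len(b_tokens)) / longest if longest else 0.0


def sif_wl_scores(reference, answers, vectors, word_prob, alpha=0.8):
    """Similarity of each student answer to the reference answer.

    alpha is an assumed mixing weight between cosine and length similarity.
    """
    texts = [reference] + answers
    emb = remove_first_component(
        np.stack([sif_average(t, vectors, word_prob) for t in texts])
    )
    return [
        alpha * cosine_similarity(emb[0], emb[i + 1])
        + (1 - alpha) * length_similarity(reference, ans)
        for i, ans in enumerate(answers)
    ]


# Toy data: random vectors stand in for trained Word2Vec embeddings.
rng = np.random.default_rng(0)
vocab = ["software", "testing", "finds", "defects", "early", "design", "is", "fun"]
vectors = {w: rng.normal(size=8) for w in vocab}
word_prob = {w: 1 / len(vocab) for w in vocab}
reference = ["software", "testing", "finds", "defects", "early"]
answers = [["testing", "finds", "defects"], ["design", "is", "fun"]]
print(sif_wl_scores(reference, answers, vectors, word_prob))
```

In practice the principal component would be estimated from a large collection of answers rather than from a handful of texts. With very few texts, removing it can strip out most of the signal, which is why the cosine helper guards against zero vectors.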