| Automated Essay Scoring (AES) is a typical application of Natural Language Processing (NLP) in education. Compared with manual marking, automatic marking saves time and effort, so its development is followed closely by the education sector. Current research on automated essay scoring falls into two scenarios: same-prompt scoring and cross-prompt scoring. Cross-prompt scoring better matches realistic needs, because a model trained on essays from a limited set of prompts is used to score essays written for new prompts. However, training and test essays differ considerably in prompt, difficulty, and scoring standard, which makes a cross-prompt scoring model harder to build. Building on previous research on cross-prompt essay scoring, this paper makes two contributions: 1. To address the insensitivity of current cross-prompt scoring models to new prompt information, a cross-prompt essay digression detection algorithm is proposed. 2. To address the insufficient extraction of information from source data by cross-prompt models, a cross-prompt automated essay scoring model based on mixed features is proposed. Specifically, the main work of this paper is as follows: 1. A cross-prompt digression detection algorithm based on the essay prompt and essay paragraphs is proposed. Most current research on cross-prompt essay scoring focuses only on extracting the semantic information of the essay and ignores information about the essay's content fragments and its target prompt. This paper proposes a new digression detection algorithm based on the target prompt and essay fragments. First, model transfer theory is used to construct a paragraph segmentation model, the essay is represented by sentence vectorization, and the essay prompt content is also represented by
vectorization. Finally, a score calculation algorithm combines the essay paragraph vectors with the essay prompt vector to produce the digression detection score of the essay. The Pearson coefficient is used to measure the correlation between the target prompt and the essay under test. Comparative experiments across multiple data sets and multiple models show that the proposed digression detection algorithm can effectively judge how far an essay digresses from its prompt. 2. A cross-prompt automated essay scoring model based on mixed features is proposed. Obtaining good shared features between source-prompt and target-prompt essays is a key factor in improving cross-prompt AES performance. Existing cross-prompt AES methods acquire shared features in two main ways: one computes shallow text features unrelated to the prompt content using hand-crafted rules; the other extracts text features directly with deep learning. The former is well targeted because of its explicit rules, but it struggles to capture abstract features such as coherence and text structure; the latter learns complex features well, but it depends heavily on high-quality training data and lacks interpretability. This paper combines the manual method with the deep learning method, joining the shared shallow text features with some prompt-independent deep features to maximize the extraction of shared features. On this basis, a two-stage cross-prompt automated essay scoring model is designed. Finally, experiments were conducted on the ASAP competition essay data set. Comparison with the baseline model shows that the proposed shared feature extraction method effectively improves the scoring performance of the cross-prompt automated essay scoring model. |
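
The digression score described in the abstract can be illustrated with a minimal sketch. It assumes that each paragraph and the prompt are already available as fixed-length vectors, that paragraph-to-prompt relevance is measured with the Pearson coefficient, and that the essay-level score is the mean over paragraphs. The toy vectors, the averaging rule, and the names `pearson` and `on_topic_score` are hypothetical; the abstract does not specify the actual score calculation or the embedding model.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length (at least 2)")
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # assumption: treat a constant vector as uncorrelated
    return cov / sqrt(var_x * var_y)


def on_topic_score(paragraph_vectors, prompt_vector):
    """Mean paragraph-to-prompt correlation; lower values suggest digression.

    Averaging over paragraphs is an illustrative aggregation rule only.
    """
    return mean(pearson(p, prompt_vector) for p in paragraph_vectors)


# Toy 4-dimensional vectors standing in for real sentence embeddings.
prompt = [0.9, 0.1, 0.4, 0.0]
on_topic_essay = [[0.8, 0.2, 0.5, 0.1], [1.0, 0.0, 0.3, 0.1]]
off_topic_essay = [[0.0, 0.9, 0.1, 0.8], [0.1, 1.0, 0.0, 0.7]]

print(on_topic_score(on_topic_essay, prompt))
print(on_topic_score(off_topic_essay, prompt))
```

In this toy setting the on-topic essay receives a positive score and the off-topic essay a negative one. A real system would also need a threshold or a calibration step to turn the score into a digression judgment.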
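
The mixed-feature idea, joining hand-crafted shallow features with deep features, might look roughly like the following sketch. The four surface statistics, the placeholder deep vector, and the names `shallow_features` and `mixed_features` are assumptions chosen for illustration. The abstract names neither the specific features nor the neural encoder, and the two-stage scoring model is not reproduced here.

```python
import re


def shallow_features(essay):
    """Prompt-independent surface statistics of an essay (illustrative choice)."""
    words = re.findall(r"[A-Za-z']+", essay.lower())
    sentences = [s for s in re.split(r"[.!?]+", essay) if s.strip()]
    n_words = len(words)
    return [
        float(n_words),                                # essay length in words
        n_words / max(len(sentences), 1),              # mean sentence length
        sum(len(w) for w in words) / max(n_words, 1),  # mean word length
        len(set(words)) / max(n_words, 1),             # type-token ratio
    ]


def mixed_features(essay, deep_vector):
    """Concatenate hand-crafted features with a deep representation."""
    return shallow_features(essay) + list(deep_vector)


essay = "Computers help people learn. They also help people work."
deep = [0.12, -0.40, 0.33]  # placeholder for a neural encoder's output
features = mixed_features(essay, deep)
print(len(features))
```

The concatenated vector would then be passed to a downstream scorer. In practice the shallow features are usually normalized first so that large raw counts do not dominate the deep features.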