| Online education has boomed in recent years with the integration of education and information technology, and the number of available test questions has grown rapidly. Organizing and managing these test resources efficiently, recommending questions effectively, assembling papers quickly, adaptive testing, and other intelligent processes have become research focuses in this field. Automatic labeling of knowledge points is the basis for managing test data and for making education more automated and intelligent. At present, there is little research on the automatic labeling of knowledge points in mathematics test questions. Compared with ordinary text, mathematics questions contain special elements such as symbols and formulas, with complex structure and semantics. Text classification techniques from the general domain, applied directly, struggle to meet the accuracy required for knowledge point prediction. This thesis therefore takes K12 mathematics questions as its research object. Building on the theory and techniques of general-domain text classification and taking the particularities of mathematical text into account, it studies the key problems in the automatic labeling of knowledge points in mathematics questions: the parsing and embedding of mathematical formulas, the feature extraction and representation of test questions, and the construction of a classification model based on label semantics and multi-label smoothing. The details are as follows: (1) A vector semantic model is used to represent the words in mathematical test questions. Besides Chinese words, the text also contains mathematical symbols and formulas. Because mathematical formulas differ from ordinary natural language text in structure and grammar, a pre-trained embedding of each formula is obtained through formula parsing and learning. (2) For the text representation and feature extraction of mathematical test questions, a neural network extracts features automatically. The knowledge points of test questions carry specific semantics and correspond to the question text, so label semantic attention is introduced to guide the neural network model toward the important information in the question text and improve the classification effect. (3) For the mathematical text classification model, the goal is to build a multi-label classifier with high robustness and strong generalization ability. To this end, a multi-label smoothing technique that fuses text features is used to modify the loss function, improving the model's predictions on new data and further improving the classification effect. (4) The model is trained on a data set of high school mathematics questions, and its validity is verified through several control experiments. The experimental results show that the model designed in this thesis is feasible and effective: introducing label attention and fusing text features into multi-label smoothing yields a better classification effect. |
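
The ideas in items (2) and (3) can be illustrated with a minimal sketch. It is not the thesis's model. The function names, the dot-product attention, and the toy vectors are all assumptions made for illustration, and plain uniform smoothing stands in for the text-feature-fused smoothing, whose exact form the abstract does not give. Each knowledge point label attends over the token vectors of a question to build a label-specific text vector, and the binary cross-entropy loss is computed against smoothed targets.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def label_attention(token_vecs, label_vecs):
    """Return one attention-weighted text vector per label (assumed dot-product attention)."""
    dim = len(token_vecs[0])
    label_docs = []
    for label in label_vecs:
        # Tokens that align with the label embedding receive larger weights.
        weights = softmax([dot(label, tok) for tok in token_vecs])
        doc = [sum(w * tok[d] for w, tok in zip(weights, token_vecs))
               for d in range(dim)]
        label_docs.append(doc)
    return label_docs


def smooth_labels(targets, epsilon=0.1):
    """Uniform multi-label smoothing: 1 -> 1 - epsilon/2, 0 -> epsilon/2."""
    return [t * (1.0 - epsilon) + epsilon / 2.0 for t in targets]


def smoothed_bce(logits, targets, epsilon=0.1):
    """Mean binary cross-entropy on logits against smoothed targets."""
    soft = smooth_labels(targets, epsilon)
    losses = [
        # Stable form of -(t*log(sigmoid(z)) + (1-t)*log(1-sigmoid(z))).
        max(z, 0.0) - z * t + math.log1p(math.exp(-abs(z)))
        for z, t in zip(logits, soft)
    ]
    return sum(losses) / len(losses)


# Toy example: three token vectors, two hypothetical knowledge point labels.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
labels = [[2.0, 0.0], [0.0, 2.0]]
docs = label_attention(tokens, labels)
# Assumed scoring rule: each label scores its own label-specific text vector.
logits = [dot(doc, label) for doc, label in zip(docs, labels)]
loss = smoothed_bce(logits, targets=[1, 0], epsilon=0.1)
```

With smoothing, a confidently correct prediction still incurs a small loss, which discourages over-confident logits. That is the usual motivation for using label smoothing to improve generalization.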