| With the adoption of electronic teaching resources, educational software, and the Internet in education, and with the creation of large student information databases, educational data mining has emerged. Educational data mining extracts useful information from massive educational data and provides guidance for educators and learners. Because of privacy protection and other constraints, the field often cannot obtain enough usable labeled data, so transfer learning has attracted wide attention in educational data mining. However, existing transfer-learning-based educational data mining methods are limited to transfer between homogeneous educational data, that is, cases where the source domain and target domain features lie in the same space. For heterogeneous educational data, homogeneous transfer learning no longer applies, and heterogeneous transfer learning methods are needed to transfer knowledge between the data sets. Student performance prediction is one of the most closely watched problems in educational data mining, which makes heterogeneous transfer learning for this task especially meaningful: by predicting students’ performance accurately and in a timely manner, students’ learning patterns and mastery of knowledge points can be extracted to help educators optimize teaching methods and improve teaching quality. Focusing on student performance prediction, this paper proposes a heterogeneous transfer learning method. The main work and contributions are as follows: 1. To address feature heterogeneity, the features of the source domain and of the target domain are each ranked by their Spearman correlation coefficient with the labels in the labeled data of that domain, and the ranked features are then selected and matched across domains. After feature matching, the source domain and target domain data can be treated as homogeneous, so a logistic regression model can be trained on the source domain and used to predict the unlabeled data in the target domain. 2. To further reduce the difference in feature distribution between the source domain and target domain that remains after feature matching, the logistic regression model trained on source domain data continues training on the labeled target domain data. The model can thereby use the information contained in both domains, narrowing the feature distribution gap between them. 3. The large-scale open-source edX platform education data set is used as the source domain and the Jilin University course data set as the target domain for experimental evaluation. The results show that the proposed algorithm predicts students’ learning performance more accurately than traditional machine learning algorithms trained only on target domain data and than mainstream homogeneous and heterogeneous transfer learning algorithms. |
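
The two methodological steps above (Spearman-based feature matching, then continued training of the source model on labeled target data) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: ranking by absolute correlation, keeping the top `k` features, pairing features by rank position, per-domain standardization, plain batch gradient descent, and the synthetic data are all choices of this sketch, and every function name here is hypothetical.

```python
import numpy as np


def average_ranks(values):
    """Rank values 1..n, giving tied values their average rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values), dtype=float)
    ranks[order] = np.arange(1, len(values) + 1)
    _, inverse, counts = np.unique(values, return_inverse=True, return_counts=True)
    return np.bincount(inverse, weights=ranks)[inverse] / counts[inverse]


def spearman(x, y):
    """Spearman coefficient: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    if rx.std() == 0.0 or ry.std() == 0.0:
        return 0.0  # assumption: a constant column scores zero
    return float(np.corrcoef(rx, ry)[0, 1])


def rank_features(X, y):
    """Column indices ordered by |Spearman(feature, label)|, strongest first."""
    scores = np.array([abs(spearman(X[:, j], y)) for j in range(X.shape[1])])
    return np.argsort(-scores, kind="mergesort")


def match_features(Xs, ys, Xt, yt, k):
    """Assumption: the i-th ranked source column is paired with the i-th ranked target column."""
    return rank_features(Xs, ys)[:k], rank_features(Xt, yt)[:k]


def standardize(X):
    """Z-score each column so matched features share a common scale."""
    std = X.std(axis=0)
    return (X - X.mean(axis=0)) / np.where(std == 0, 1.0, std)


def add_bias(X):
    return np.column_stack([np.ones(len(X)), X])


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def log_loss(w, X, y):
    """Mean logistic loss, written with logaddexp for numerical stability."""
    z = add_bias(X) @ w
    return float(np.mean(np.logaddexp(0.0, z) - y * z))


def train_logreg(X, y, w=None, lr=0.5, epochs=200):
    """Batch gradient descent; passing w continues training from those weights."""
    Xb = add_bias(X)
    w = np.zeros(Xb.shape[1]) if w is None else np.array(w, dtype=float)
    for _ in range(epochs):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w


def predict(w, X):
    return (sigmoid(add_bias(X) @ w) >= 0.5).astype(int)


def make_domain(rng, n, layout, scale):
    """Synthetic stand-in for a course data set (not edX or Jilin University data)."""
    ability = rng.normal(size=n)
    y = (ability + 0.3 * rng.normal(size=n) > 0).astype(float)
    cols = []
    for kind in layout:
        noise = rng.normal(size=n)
        cols.append({"strong": ability, "weak": ability + noise, "noise": noise}[kind])
    return scale * np.column_stack(cols), y


def run_demo(seed=0, k=2, n_labeled=50):
    rng = np.random.default_rng(seed)
    # Heterogeneous domains: different feature counts, column orders, and scales.
    Xs, ys = make_domain(rng, 400, ["noise", "strong", "weak", "noise"], scale=1.0)
    Xt, yt = make_domain(rng, 200, ["weak", "noise", "strong"], scale=10.0)
    yt_lab = yt[:n_labeled]  # only a small part of the target domain is labeled
    src_cols, tgt_cols = match_features(Xs, ys, Xt[:n_labeled], yt_lab, k)
    Xs_m, Xt_m = standardize(Xs[:, src_cols]), standardize(Xt[:, tgt_cols])
    w_src = train_logreg(Xs_m, ys)                         # step 1: source only
    w_ft = train_logreg(Xt_m[:n_labeled], yt_lab, w=w_src)  # step 2: continue on target
    return {
        "loss_source_only": log_loss(w_src, Xt_m[:n_labeled], yt_lab),
        "loss_fine_tuned": log_loss(w_ft, Xt_m[:n_labeled], yt_lab),
        "pred_unlabeled": predict(w_ft, Xt_m[n_labeled:]),
    }


if __name__ == "__main__":
    result = run_demo()
    print(result["loss_source_only"], result["loss_fine_tuned"])
```

Calling `run_demo()` returns the logistic loss on the labeled target rows before and after the continued training, together with predictions for the unlabeled target rows. In practice, `scipy.stats.spearmanr` and a library logistic regression could replace the hand-written helpers; they are written out here only to keep the sketch dependent on NumPy alone.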