| With the rapid rise of the Internet industry, society is moving from the information age into the era of big data. The campus card system, which integrates a large volume of data on students' campus behavior, has been widely adopted in university informatization. It brings great convenience to students while accumulating a large amount of student campus behavior flow data. Student academic performance is an important indicator of teaching quality; it matters both for students' growth and development and for teachers evaluating teaching results. Mining the information behind campus card data and analyzing the underlying relationships between student behavior and performance has therefore become an increasingly important research topic for universities and researchers. Based on campus card data, this research collects the basic attributes, consumption habits, book borrowing records, and other daily-life data of college students. The data are preprocessed to select the attribute features used in the experiments, student user portraits are constructed, and the relationship between students' daily behavior and performance is presented to assist university teaching managers in scientific management and decision-making. The main work of the research is divided into the following aspects: (1) Constructing user portraits and predicting grades. Because time and place features are prominent in students' campus behavior data, a sparse spatial-temporal feature processing method is proposed: temporal and spatial features are processed to refine the rules of time and consumption, and a data sparsification operation is introduced. The dataset is built by adding the concepts of breakfast activity, bath activity, and consumption patterns. From the perspective of weighted Euclidean distance and the optimization of k-value selection in the PCA-K-means 
algorithm, the Sparse-spatial-temporal clustering method based on Distance PCA-K-means (STBDK) is proposed. Students' campus behaviors are clustered into four types, and the group characteristics of the four types of students are analyzed. The students' data in the test set are then used to predict their performance, and the results are compared with those of the unmodified PCA-K-means algorithm and the Euclidean-K-means algorithm. The experimental results show that the STBDK algorithm is more accurate than the other two models, which supports the soundness of the algorithm. (2) Analyzing student campus behavior and predicting scholarships. In this study, scholarship prediction is abstracted as a binary classification problem. When constructing the student feature dataset, features are extracted from the student user portraits mainly from the perspective of campus life. Random forest is used to optimize feature selection: the Gini index is used to analyze the importance of the features, and the selected features are then fed into a logistic regression model. On this basis, an algorithm named Evaluating the importance of and prediction Student Performance based on Random Forest and Logistic Regression (EIPRF-LR) is proposed. It analyzes whether students' living habits affect their learning and their chances of obtaining scholarships. Compared with the baseline LR model, the EIPRF-LR model achieves higher accuracy. |
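
The Gini-based importance analysis mentioned for EIPRF-LR can be illustrated with a minimal sketch. It is not the thesis implementation: instead of a full random forest, it scores each feature by the largest Gini impurity decrease a single threshold split can achieve, which is the quantity a random forest accumulates over its trees. The function names (`gini_impurity`, `best_gini_decrease`, `rank_features`), the feature names, and the toy data are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Sequence, Tuple


def gini_impurity(labels: Sequence[int]) -> float:
    """Gini impurity 1 - sum(p_k^2) of a collection of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts: Dict[int, int] = {}
    for y in labels:
        counts[y] = counts.get(y, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def best_gini_decrease(values: Sequence[float], labels: Sequence[int]) -> float:
    """Largest impurity decrease obtainable by one threshold split on a feature."""
    n = len(labels)
    parent = gini_impurity(labels)
    best = 0.0
    # Every distinct value except the largest is a candidate threshold.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        child = (len(left) * gini_impurity(left)
                 + len(right) * gini_impurity(right)) / n
        best = max(best, parent - child)
    return best


def rank_features(rows: Sequence[Sequence[float]],
                  labels: Sequence[int],
                  names: Sequence[str]) -> List[Tuple[str, float]]:
    """Return (feature name, score) pairs sorted from most to least important."""
    columns = list(zip(*rows))
    scores = {name: best_gini_decrease(col, labels)
              for name, col in zip(names, columns)}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: [breakfasts per week, book loans per term].
rows = [[6, 2], [5, 9], [7, 4], [1, 3], [0, 8], [2, 5]]
labels = [1, 1, 1, 0, 0, 0]  # 1 = scholarship, 0 = no scholarship (assumed coding)
ranking = rank_features(rows, labels, ["breakfasts", "loans"])
print(ranking)
```

On this toy data the breakfast feature separates the two classes perfectly and therefore ranks first. In a pipeline of the kind the abstract describes, only the top-ranked features would then be passed to a logistic regression classifier.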