| In response to the high dropout rate of Massive Open Online Courses (MOOCs), many studies have addressed MOOC dropout prediction. These studies aim to build models that identify potential dropouts early enough to enable timely intervention and reduce the dropout rate. However, existing work has two main problems. First, the models rely heavily on their input data: they often require extensive feature engineering and human intervention to obtain good results, and they give little consideration to high-dimensional feature extraction or temporal order. Second, dropout prediction models based on Long Short-Term Memory (LSTM) and the Gated Recurrent Unit (GRU) suffer from vanishing gradients as the sequence length increases, which makes it difficult to capture long-term dependencies and leaves them poorly suited to long-sequence prediction. New methods for dropout prediction are therefore needed. This paper first proposes a CNN-BiGRU-ATT model that predicts whether a learner will drop out in the following week. A Convolutional Neural Network (CNN) automatically extracts high-dimensional features that affect dropout, reducing the dependence on feature engineering; a Bidirectional Gated Recurrent Unit (BiGRU) captures dependencies in the time series and improves computational efficiency compared with existing LSTM-based models; a feedforward attention mechanism then increases the weight of important features, and a fully connected layer outputs the probability of dropout in the following week. On the XuetangX dataset, the model improves accuracy, precision, recall, and F1-score by approximately 3%, 9%, 4%, and 7%, respectively, compared with the GRU model, and improves accuracy and F1-score by about 2% and 0.5%, respectively, compared with advanced models. This paper then builds a MOOC dropout prediction model based on Informer to address long-sequence prediction in MOOC dropout prediction. Informer addresses the memory degradation caused by long time series through mechanisms such as multi-head ProbSparse self-attention and a generative decoder, which better capture long-term dependencies in the sequence. The data are analyzed and processed, the statistical granularity of the features is refined to days, prediction targets are set for the next 7, 14, and 21 days, and comparative experiments are conducted. On the XuetangX dataset, for dropout prediction over the next 21 days, the MAE and RMSE of the Informer-based model are about 10% and 2% lower, respectively, than those of the CNN-BiGRU-ATT model. In addition, most existing literature studies MOOC dropout prediction using clickstream data only, which provides a single data dimension. This paper constructs an initial feature set of 52 features drawn from clickstream data, user information, and course information, providing stronger support for the research. |
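
The feedforward attention step in the CNN-BiGRU-ATT model can be illustrated with a minimal sketch. It assumes a common formulation in which each time step's hidden state receives a scalar score, the scores are normalized with a softmax, and the weighted sum of hidden states is passed to a single sigmoid output unit. The function names, the tanh scoring function, and all weights and inputs below are hypothetical and untrained; they are not the configuration used in the paper.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def feedforward_attention(
    hidden_states: Sequence[Sequence[float]],
    score_weights: Sequence[float],
    score_bias: float = 0.0,
) -> Tuple[List[float], List[float]]:
    """Pool per-step hidden states into one context vector.

    Assumed formulation (illustrative only):
        e_t   = tanh(w . h_t + b)
        alpha = softmax(e)
        c     = sum_t alpha_t * h_t
    """
    scores = [
        math.tanh(sum(w * h for w, h in zip(score_weights, h_t)) + score_bias)
        for h_t in hidden_states
    ]
    alphas = softmax(scores)
    dim = len(hidden_states[0])
    context = [
        sum(a * h_t[i] for a, h_t in zip(alphas, hidden_states))
        for i in range(dim)
    ]
    return context, alphas


def dropout_probability(
    context: Sequence[float],
    out_weights: Sequence[float],
    out_bias: float = 0.0,
) -> float:
    """Single fully connected unit with a sigmoid, giving a value in (0, 1)."""
    logit = sum(w * c for w, c in zip(out_weights, context)) + out_bias
    return 1.0 / (1.0 + math.exp(-logit))


# Hypothetical BiGRU outputs for four time steps, three units each.
states = [
    [0.2, -0.1, 0.4],
    [0.5, 0.3, -0.2],
    [-0.3, 0.8, 0.1],
    [0.9, -0.4, 0.6],
]
context, alphas = feedforward_attention(states, score_weights=[0.7, -0.2, 0.5])
prob = dropout_probability(context, out_weights=[1.0, -0.5, 0.8], out_bias=-0.1)
print("attention weights:", [round(a, 3) for a in alphas])
print("dropout probability:", round(prob, 3))
```

In a trained model the scoring and output weights would be learned jointly with the CNN and BiGRU layers; the sketch only shows how the attention weights redistribute emphasis across time steps before the final prediction.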