| In online learning, engagement is an important indicator of the learning experience. Improving the accuracy of engagement recognition helps teachers obtain timely feedback on a course and thereby improve students' learning experience. Mainstream research on student engagement recognition is based on video data. Earlier methods based on questionnaires and biosensors suffer from poor reliability and high equipment cost. In recent years, deep learning has been increasingly applied to engagement recognition with considerable success, but shortcomings remain. For example, feature-based engagement recognition methods use only a few features and do not comprehensively consider the behavioral and emotional information related to engagement, while methods based on end-to-end models suffer from high model complexity and poor real-time performance. To address these problems, the main work of this thesis is as follows: (1) First, the extraction of learners' behavioral and emotional features is studied. Methods are proposed for extracting behavioral features with OpenFace and emotional features with a deep residual attention network. The Pearson correlation coefficients between these features and the engagement level are then examined, along with the statistical differences of the features across engagement levels. The results show that combining learners' behavioral and emotional features is reasonable for the engagement recognition task, which lays the groundwork for the subsequent recognition methods. (2) An engagement recognition approach based on fused features is presented. Feature vectors that fuse learners' behavioral and emotional information are fed into a time-series model for recognition. For the time-series model, building on the 
Temporal Convolutional Network (TCN), which is well suited to spatiotemporal data, this thesis presents a Spatiotemporal Attention Temporal Convolution Network (SA-TCN), which enables the model to adaptively extract important temporal and spatial features and ignore redundant ones, thus improving the accuracy of engagement recognition. The algorithm is trained and validated on the public engagement recognition dataset DAiSEE. The experimental results show that the proposed fused-feature algorithm achieves 62.5% accuracy on four-class classification of the engagement level, outperforming previous engagement recognition algorithms, and also improves recognition of the minority classes. It is further validated on the EmotiW-EP dataset, where it achieves an MSE of 0.0708. (3) Because practical applications place higher demands on the speed of engagement recognition, a channel-hybrid I3D engagement recognition network is proposed, starting from a single-stage 3D convolutional network. First, the model parameters are reduced by replacing the convolution kernels of the Inception modules in the original I3D. Second, the lightweight network ShuffleNet is fused into the improved I3D network, and a multi-class focal loss is introduced to improve the model's supervision of the loss across all classes. The experimental results show that the proposed I3D-CFShuffleNet engagement recognition network not only improves on the performance of the original I3D network but also shortens inference time while maintaining high recognition accuracy compared with other methods on DAiSEE. The algorithm is evaluated in terms of both accuracy and speed and has practical application value. |
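
The abstract mentions a multi-class focal loss but does not give its form. As a rough illustration only, the sketch below uses the commonly cited formulation, FL = -alpha_t (1 - p_t)^gamma log(p_t), applied to a softmax over four engagement levels. The function names, the value gamma = 2.0, the optional per-class weights `alpha`, and the example scores are all assumptions made for this sketch and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of raw class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def focal_loss(logits, target, gamma=2.0, alpha=None):
    """Multi-class focal loss for a single sample.

    logits: raw scores, one per engagement level
    target: index of the true class
    gamma:  focusing parameter (assumed value; gamma = 0 gives cross-entropy)
    alpha:  optional per-class weights (assumed, not from the thesis)
    """
    p_t = softmax(logits)[target]
    weight = 1.0 if alpha is None else alpha[target]
    # (1 - p_t)^gamma shrinks the loss of samples that are already well classified
    return -weight * (1.0 - p_t) ** gamma * math.log(p_t)


def cross_entropy(logits, target):
    """Plain cross-entropy, obtained as the gamma = 0 special case."""
    return focal_loss(logits, target, gamma=0.0)


# Hypothetical scores for four engagement levels; the true class is 0 in both cases.
easy = [4.0, 0.5, 0.2, 0.1]  # confident, correct prediction
hard = [0.2, 0.1, 0.3, 0.4]  # uncertain prediction

for name, logits in (("easy", easy), ("hard", hard)):
    print(name, cross_entropy(logits, 0), focal_loss(logits, 0))
```

Under this formulation the loss of the confidently classified sample is scaled down far more than that of the uncertain one, so training signal concentrates on hard samples, which on an imbalanced dataset are often the minority-class ones.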