Multi-Modal Pre-Trained Models For Difficulty Prediction And Knowledge Tracing Of Programming Problems

Posted on: 2024-02-29
Degree: Master
Type: Thesis
Country: China
Candidate: Z Y Wang
GTID: 2568307067494484
Subject: Electronic information
Abstract/Summary:
With the growing demand for programming skills across industries, students often turn to online programming platforms for practice and competitions. However, there has been relatively little research on intelligent digital education for programming problems. This thesis focuses on two tasks for programming problems, difficulty prediction and knowledge tracing, and explores both in a data-driven manner. For difficulty prediction, existing methods do not consider the multimodal information of programming problems and, in particular, do not model the code solution. For knowledge tracing, existing methods do not take into account the multi-factor difficulty information of problems or the fact that a single exercise can involve multiple knowledge points. This thesis therefore studies the two tasks separately and proposes a solution for each.

For the first task, difficulty prediction is formulated as a multimodal understanding problem and solved in a data-driven manner by jointly modeling the exercise text and the code solution. This makes it possible to obtain a more objective evaluation of problem difficulty without relying on accumulated student submissions. The proposed model, C-BERT, is inspired by recent large-scale pre-trained models and uses BERT and CodeBERT to model the problem text and the solution code, respectively. To further improve performance, a cross-modal CLS representation is used to fuse BERT and CodeBERT, capturing the interaction between the two modalities. In addition, CodeBERT is fine-tuned using code type information. Experiments on two open-source datasets built from the real POJ website show that C-BERT has advantages over other difficulty prediction methods for programming problems.

The second task is inspired by the first. Previous research on knowledge tracing for programming problems has overlooked multi-factor difficulty information and multimodal heterogeneous data. In addition, this thesis finds that a programming problem can be solved in multiple ways involving different knowledge points, so students may learn different knowledge from the same problem. To address these issues, this thesis proposes a knowledge tracing method called DPKT, which constructs multi-factor difficulty information from the multimodal data of programming problems to assist the knowledge tracing process. Specifically, multi-factor difficulty information is introduced into four parts: subjective difficulty judgment, knowledge level improvement, code-based knowledge acquisition filtering, and knowledge state update. The knowledge acquisition filtering module, which is based on the solution code of the problem, aims to update the knowledge used in the student's code. Finally, experiments on two public datasets show that DPKT performs significantly better than the baseline models.
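
The abstract does not specify how C-BERT fuses the two CLS representations, so the following is only a minimal sketch under stated assumptions: the text and code encoders are replaced by plain vectors, fusion is simple concatenation, and difficulty is treated as a classification over a few levels. The names `fuse_cls` and `predict_difficulty`, and all numeric values, are hypothetical and are not taken from the thesis.

```python
import math
from typing import List, Sequence


def fuse_cls(text_cls: Sequence[float], code_cls: Sequence[float]) -> List[float]:
    """Fuse the two modalities by concatenating their CLS vectors.

    Assumption: concatenation stands in for the cross-modal fusion,
    whose exact form the abstract does not describe.
    """
    return list(text_cls) + list(code_cls)


def predict_difficulty(
    fused: Sequence[float],
    weights: Sequence[Sequence[float]],
    bias: Sequence[float],
) -> List[float]:
    """Apply a linear head followed by a softmax over difficulty levels.

    Assumption: difficulty is modeled as a small set of classes.
    """
    logits = [
        sum(w * x for w, x in zip(row, fused)) + b
        for row, b in zip(weights, bias)
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Stand-ins for the CLS outputs of a text encoder and a code encoder.
text_cls = [0.2, -0.1]
code_cls = [0.5, 0.3]

fused = fuse_cls(text_cls, code_cls)

# Hypothetical head with three difficulty levels (untrained, illustrative values).
weights = [
    [0.1, 0.0, -0.2, 0.3],
    [0.0, 0.4, 0.1, -0.1],
    [-0.3, 0.2, 0.5, 0.0],
]
bias = [0.0, 0.1, -0.1]

probs = predict_difficulty(fused, weights, bias)
predicted_level = probs.index(max(probs))
```

In a real system the two input vectors would come from pre-trained encoders and the head would be trained on labeled problems; the sketch only shows where the two modalities meet before prediction.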
Keywords/Search Tags: programming problem, multimodal representation learning, difficulty prediction, knowledge tracing, pre-trained language model