Research On The Item-doing Incentive Mechanism Based On Reinforcement Learning For Students

Posted on:2022-11-16

Degree:Master

Type:Thesis

Country:China

Candidate:S D Niu

GTID:2480306764480434

Subject:Automation Technology

Abstract/Summary:

With the rapid development of information and Internet technology, more and more students acquire and consolidate knowledge through online learning. Doing items is an effective way to test and improve students’ learning outcomes, and a suitable reward can increase students’ enthusiasm for doing items. In this thesis, the process of students doing items and earning scores is modeled as a Markov Decision Process (MDP). By studying how the reward function of the MDP is set, the score rewards that students receive while doing items are designed. The main work covers three aspects:

(1) The two objective factors that affect students’ scores while doing items are analyzed, namely the difficulty of the items and the time spent on each item, and an initial reward function is designed. A subjective factor, the student’s confidence index, is introduced; it represents how confident the student is of answering an item correctly. A proper scoring rule, the "logarithmic" scoring rule, is designed to ensure that students report a confidence index that matches their true level. Based on item difficulty, the time spent on each item, and the confidence index, a scoring scheme that motivates students to do items is proposed.

(2) To speed up the convergence of learning while keeping the optimal policy of reinforcement learning unchanged, a reward shaping scheme based on a dynamic potential function is proposed. Theoretical derivation proves that the scheme preserves the optimal policy and establishes the equivalence between the shaping function based on dynamic potential and the initial reward function. Five student item-doing reward schemes, comprising the scheme proposed in this thesis and four other classical reinforcement learning reward schemes, are designed for simulation experiments. Comparing the optimal policy and the average number of steps at convergence obtained under each scheme demonstrates the effectiveness of the proposed scheme.

(3) An online item-doing system for students is designed and developed, covering requirements analysis, overall architecture design, and detailed design of several key modules and database tables. The main functions implemented include online assessment in which students earn scores under the reward scheme proposed in this thesis, answer sheet analysis, and basic management of items, knowledge points, and user information.

In this thesis, reinforcement learning and reward shaping are applied to the design of score rewards for students doing items online. This makes it possible to test students’ learning achievements and give them personalized score rewards through item doing, and it offers a way to encourage students to participate actively in online learning and item doing. The approach also helps promote the realization of individualized education for students.
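
The role of the logarithmic scoring rule in aspect (1) can be illustrated with a minimal sketch. The abstract does not give the exact formula used in the thesis, so the form below is the textbook logarithmic rule and is an assumption here: a student who reports confidence p earns log(p) if the answer is correct and log(1 - p) if it is wrong. The function names (log_score, expected_score, best_report) are hypothetical. The sketch checks numerically that a student maximizes the expected score by reporting the true probability of answering correctly, which is what makes the rule "proper".

```python
import math


def log_score(confidence: float, correct: bool) -> float:
    """Textbook logarithmic scoring rule (assumed form, not taken from the thesis).

    Returns log(p) when the answer is correct and log(1 - p) otherwise,
    where p is the confidence the student reported.
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return math.log(confidence if correct else 1.0 - confidence)


def expected_score(reported: float, true_prob: float) -> float:
    """Expected score when the student is right with probability true_prob
    but reports the confidence `reported`."""
    return (true_prob * log_score(reported, True)
            + (1.0 - true_prob) * log_score(reported, False))


def best_report(true_prob: float, steps: int = 99) -> float:
    """Search a grid of reports 1/(steps+1), ..., steps/(steps+1) for the one
    with the highest expected score."""
    grid = [i / (steps + 1) for i in range(1, steps + 1)]
    return max(grid, key=lambda p: expected_score(p, true_prob))


if __name__ == "__main__":
    true_prob = 0.7  # hypothetical student who answers correctly 70% of the time
    print("best report on the grid:", best_report(true_prob))
    print("honest report:          ", expected_score(0.7, true_prob))
    print("overconfident report:   ", expected_score(0.9, true_prob))
```

Under this assumed form, overstating or understating confidence lowers the expected score, so a score-maximizing student has no incentive to misreport. How the thesis combines this term with item difficulty and time spent is not specified in the abstract and is left out of the sketch.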

Keywords/Search Tags:

Reinforcement Learning, Item-doing Rewards for Student, Confidence Index, Reward Shaping Based on Dynamic Potential Function

Related items

1	Research And Application Of Reward Strategies For Reinforcement Learning In Incomplete Information Games
2	Research On AUV Obstacle Avoidance Method Based On Reinforcement Learning
3	On The Theory Of Single-task And Multitask Reward-free Reinforcement Learning Under Low-rank MDPs
4	Optimal Control Of Discrete-Time Systems:Average-Reward-Based Reinforcement Learning Methods
5	Research On Magnetic Anomaly Target Tracking Algorithm Based On Reinforcement Learning
6	Research On Autonomous Driving Human-like Car-following Decision Algorithm Based On Deep Reinforcement Learning
7	A Study Of Influence Maximization Based On Reinforcement Learning
8	Research On Stock Index Prediction Model Based On Complex Network And Reinforcement Learning
9	The Empirical MAB Strategies With Markov Structure
10	Active Learning, Technology, and Student-to-Student Connectedness: Examining Technology Enabled Active Learning Classrooms In Chinese Higher Education