Research On Complex Games Based On Deep Reinforcement Learning

Posted on: 2024-02-17
Degree: Master
Type: Thesis
Country: China
Candidate: Y F Rui
Full Text: PDF
GTID: 2530307106481874
Subject: Software engineering
Abstract/Summary:
Recent years have seen major successes for artificial intelligence in games, such as AlphaGo and Pluribus. However, existing game-playing approaches are hard to apply directly to complex games, for three main reasons: huge action spaces, sparse rewards, and the need for cooperation among players. This thesis introduces approaches based on Deep Reinforcement Learning to address these challenges: Deep Apprenticeship Learning, splitting of the composite action space, and team value decomposition. Dou Di Zhu, a traditional Chinese poker game, is used as the platform to verify the proposed approaches. The specific work is as follows.

First, Deep Apprenticeship Learning is adopted to address the problem of sparse rewards. Knowledge distillation and transfer learning are introduced to handle the problems that Deep Apprenticeship Learning encounters when it is applied to complex games. To deal with the limited stochasticity of expert data and to continuously improve the effectiveness of the reward function obtained from Deep Apprenticeship Learning, a self-learning model is proposed that integrates the Deep Apprenticeship Learning and Deep Reinforcement Learning processes through a scheduled shared experience buffer.

Second, to deal with the problem of the huge action space, a method that simplifies the action space by splitting it is proposed. The Dou Di Zhu action space of 27472 elements is split into a top-level action space of 309 elements and a bottom-level action space of 9037 elements. Based on the split action spaces, a hierarchical network is built that changes the decision-making process but performs the same game task with simpler action spaces.

Third, to realize cooperation among players, this thesis proposes a dynamic cooperation mechanism based on Deep Recurrent Value Decomposition Networks. The final result of the peasants is taken as the joint reward, and this joint reward is decomposed into each peasant's own value network. By optimizing its own value network, each peasant maximizes the benefit of the team, and cooperation between the peasants is realized.

The experiments verify that the proposed approaches effectively address the problems of sparse rewards, huge action space, and multi-agent cooperation in complex games, and achieve better game results.
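The value-decomposition idea in the third contribution can be illustrated with a minimal sketch. It assumes the plain additive form used by standard Value Decomposition Networks, where the team value is the sum of per-agent values; the abstract does not give the thesis's recurrent architecture, so the function names and the Q-values below are hypothetical and only show why each peasant optimizing its own value also maximizes the team value.

```python
from itertools import product


def joint_value(per_agent_q, joint_action):
    """Additive decomposition (assumed): Q_tot(a_1..a_n) = sum_i Q_i(a_i)."""
    return sum(q[a] for q, a in zip(per_agent_q, joint_action))


def greedy_joint_action(per_agent_q):
    """Each agent independently picks the action with its own highest Q_i."""
    return tuple(max(range(len(q)), key=q.__getitem__) for q in per_agent_q)


def brute_force_joint_action(per_agent_q):
    """Search every joint action for the highest Q_tot (for checking only)."""
    spaces = [range(len(q)) for q in per_agent_q]
    return max(product(*spaces), key=lambda ja: joint_value(per_agent_q, ja))


# Hypothetical Q-values for two peasants with three candidate actions each.
peasant_q = [[0.1, 0.7, -0.2], [0.4, 0.0, 0.9]]

greedy = greedy_joint_action(peasant_q)
# Under an additive decomposition, per-agent greedy choices match the
# joint action found by exhaustive search over the team's action space.
assert greedy == brute_force_joint_action(peasant_q)
print(greedy, joint_value(peasant_q, greedy))
```

Because the team value is a sum, the exhaustive search over joint actions is never needed at decision time: each agent only has to search its own action list.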
Keywords/Search Tags: Deep Reinforcement Learning, Deep Apprenticeship Learning, Computer Game, Multi-agent Cooperation