Payoff Control In Stochastic Games

Posted on: 2020-01-25
Degree: Master
Type: Thesis
Country: China
Candidate: Y Lu
Full Text: PDF
GTID: 2370330596475059
Subject: Computer Science and Technology
Abstract/Summary:
In many research areas, finding the optimal strategy for multiple agents in a long-term relationship is one of the most fundamental tasks. The repeated game, the mainstream theoretical framework for modeling and analyzing long-term multi-agent interaction, has been widely studied and applied in artificial intelligence, economics, and biology. Previous studies held that, because outcomes are jointly determined by all players, it is especially difficult for one player to control the payoffs unilaterally. The Zero-Determinant strategy, however, shows that players' ability to affect payoffs unilaterally had been underestimated. In repeated games with an invariant environment, this payoff control strategy can unilaterally enforce a linear relationship between the long-term payoffs of the two players. This discovery overturns the traditional game-theoretic view that payoff control is difficult to achieve, and it is a landmark for the study of relationships between individuals and between strategies and payoffs. The strategy was subsequently verified and improved by many researchers, and it has been extended to imperfect-information games, a more common setting that includes information noise. These extensions, however, still rest on repeated games with an invariant environment; that is, they assume that every stage of the repeated game takes place in the same environmental state.
This thesis proposes an algorithmic framework for mutual payoff control in stochastic game environments. Introducing random environment variables into the game brings the setting closer to the real world. For this more complicated setting, a more general payoff control algorithm is proposed. Feasible strategies derived from the framework not only achieve payoff control in stochastic games but also retain good control properties in complex and changing environments.
First, from a theoretical standpoint, the thesis puts forward the payoff control algorithm for the complex setting of stochastic games; the rationality and effectiveness of the algorithm are derived and verified, establishing the payoff control strategy in stochastic games. Second, simulation experiments in large-scale games lead to some interesting conclusions. Compared with other classical strategies, the Zero-Determinant strategy does not perform particularly well in terms of dominance and evolutionary stability. However, the performance of subsets of the Zero-Determinant strategies shows that winning individual games is not everything. The extortion strategy, a subset of the Zero-Determinant strategy, breaks the equilibrium of mutual defection in the Iterated Prisoner's Dilemma, offers the players an effective way to achieve mutual benefit, and raises the level of cooperation. Finally, the game between an extortionist and a reinforcement learning agent is simulated. The simulation shows that the Zero-Determinant strategy can still achieve unilateral payoff control against an agent that dynamically optimizes its strategy.
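The unilateral linear relationship mentioned above for the invariant-environment case can be illustrated with a minimal sketch. It is not the thesis's stochastic-game algorithm: it assumes the standard memory-one extortion construction for the Iterated Prisoner's Dilemma, and the payoff values, the parameters `chi` and `phi`, and the function names are all illustrative assumptions. Whatever memory-one strategy `q` the opponent plays, the long-run payoffs should satisfy s_X - P = chi * (s_Y - P).

```python
# Illustrative Prisoner's Dilemma payoffs (assumed, not from the thesis): T > R > P > S.
R, S, T, P = 3.0, 0.0, 5.0, 1.0


def extortion_strategy(chi, phi):
    """Cooperation probabilities of player X after outcomes (CC, CD, DC, DD).

    Outcomes are written with X's move first. chi is the extortion factor
    and phi a scaling constant; both must keep every entry inside [0, 1].
    """
    return [
        1.0 - phi * (chi - 1.0) * (R - P),
        1.0 - phi * ((P - S) + chi * (T - P)),
        phi * ((T - P) + chi * (P - S)),
        0.0,
    ]


def long_run_payoffs(p, q, steps=5000):
    """Approximate the stationary payoffs (s_x, s_y) by iterating the Markov chain.

    p and q are each player's cooperation probabilities indexed by the previous
    outcome from that player's own point of view (own move first).
    """
    # Player Y sees the outcomes CD and DC swapped relative to X.
    q_seen = [q[0], q[2], q[1], q[3]]
    v = [0.25, 0.25, 0.25, 0.25]  # distribution over (CC, CD, DC, DD)
    for _ in range(steps):
        nxt = [0.0, 0.0, 0.0, 0.0]
        for i in range(4):
            px, qy = p[i], q_seen[i]
            row = [px * qy, px * (1 - qy), (1 - px) * qy, (1 - px) * (1 - qy)]
            for j in range(4):
                nxt[j] += v[i] * row[j]
        v = nxt
    s_x = sum(vi * u for vi, u in zip(v, [R, S, T, P]))
    s_y = sum(vi * u for vi, u in zip(v, [R, T, S, P]))
    return s_x, s_y


if __name__ == "__main__":
    chi, phi = 3.0, 1.0 / 26.0
    p = extortion_strategy(chi, phi)
    for q in ([0.7, 0.2, 0.6, 0.3], [0.9, 0.5, 0.1, 0.4]):
        s_x, s_y = long_run_payoffs(p, q)
        # The two printed surplus terms should agree for every opponent q.
        print(q, s_x - P, chi * (s_y - P))
```

The opponent strategies in the usage loop are arbitrary interior probabilities chosen so that the chain has a unique stationary distribution; power iteration is used here only to keep the sketch free of external dependencies.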
Keywords/Search Tags: Multi-agent system, optimizing decision, Stochastic Games, Zero-Determinant strategy