Discrete Time Markov Decision Processes Based On Variance Constraint

Posted on:2022-04-02

Degree:Master

Type:Thesis

Country:China

Candidate:H T Lin

GTID:2480306734465684

Subject:Science

Abstract/Summary:

This thesis studies discrete-time discounted Markov decision processes with a countable state space, a Borel action space, and a non-negative reward function, subject to a constraint on the variance of the discounted total reward. The goal is to find a policy that maximizes the expected discounted total reward while the variance of the discounted total reward is constrained. The main difficulty is proving that an optimal policy exists under the variance constraint. To establish existence, we first derive a formula for the variance of the discounted total reward. Under this formula, the variance can be regarded as an expected discounted total cost with discount factor α² and cost function h(x,g). A constant bound on the variance is therefore equivalent to a constant bound on this expected discounted total cost, so the existence of an optimal policy under a variance constraint reduces to the existence of an optimal policy under a total-cost constraint. For the resulting constrained optimization problem, the Lagrange multiplier method is used to prove that a randomized simple policy maximizes the expected discounted total reward, which yields the existence of an optimal policy for discrete-time discounted Markov decision processes with a variance constraint. Finally, an example with a variance constraint illustrates the result.
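The idea of treating the variance as a discounted cost with discount factor α² can be illustrated numerically. The sketch below is not the thesis's own construction: it assumes a small finite Markov reward chain under one fixed stationary policy, with a hypothetical transition matrix `P`, reward vector `R`, and discount `ALPHA`, and it uses the standard law-of-total-variance identity in which the per-step cost is h(x) = α² · Var[V(next state)]. The thesis's h(x,g) may be defined differently.

```python
# Hypothetical 2-state chain under a fixed policy (illustrative numbers only).
ALPHA = 0.9
P = [[0.5, 0.5],
     [0.2, 0.8]]
R = [1.0, 2.0]


def solve_discounted(cost, discount, P, tol=1e-12, max_iter=100_000):
    """Fixed-point iteration for v = cost + discount * P v."""
    n = len(cost)
    v = [0.0] * n
    for _ in range(max_iter):
        new = [cost[x] + discount * sum(P[x][y] * v[y] for y in range(n))
               for x in range(n)]
        if max(abs(a - b) for a, b in zip(new, v)) < tol:
            return new
        v = new
    return v


def mean_and_variance(P, R, alpha):
    """Return (V, var): mean and variance of the discounted total reward."""
    n = len(R)
    # Expected discounted total reward: V = R + alpha * P V.
    V = solve_discounted(R, alpha, P)
    # Assumed cost function: h(x) = alpha^2 * Var_{y ~ P(x, .)}[V(y)].
    h = []
    for x in range(n):
        m1 = sum(P[x][y] * V[y] for y in range(n))
        m2 = sum(P[x][y] * V[y] ** 2 for y in range(n))
        h.append(alpha ** 2 * (m2 - m1 ** 2))
    # The variance solves the same kind of equation with discount alpha^2.
    var = solve_discounted(h, alpha ** 2, P)
    return V, var


if __name__ == "__main__":
    V, var = mean_and_variance(P, R, ALPHA)
    print("mean:", V)
    print("variance:", var)
```

Because the variance satisfies a discounted-cost equation of the same form as the reward, a bound on the variance can be handled with the same machinery as any other expected discounted cost constraint, which is the reduction the abstract describes.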

Keywords/Search Tags:

Lagrange multiplier method, Randomized simple policy, Expected discounted total reward, Optimal policy

Related items

1	Optimal Overhaul Policy And Analysis Of M/G/1 Queueing System With Randomized Overhaul(p,Y)-Policy
2	Analysis Of M/G/1 Queueing System With Server Vacation And Min(N,D,V)-Control Policy
3	M^{λ₁, λ₂}/G/1 Queuing System With Variable Arrival Rate And Multiple Vacations Under Control Of The Min(N,V)-policy
4	Performance Analysis Of Queueing Systems With Bi-Level Randomized（p,N₁,N₂）-Policy
5	Period Of Mixed Dividend Policy Of The Classical Risk Model
6	Research On Dividends And Ruin Problems In The MAP-modulated Risk Model
7	M/G/1(Repairable)Queueing System Under Min(N,D,V)-Policy Control
8	Reliability And Maintenance Replacement Policy Of Cold Standby Systems With Two Units
9	Modeling And Analyzing A Queueing System With Admission Control And Maintenance Policy
10	M/G/1(Repairable)Queueing System With Adaptive Multistage Vacation And Min(N,V)-Policy