
Variance Optimization For Continuous-time Markov Decision Processes

Posted on: 2021-03-23    Degree: Master    Type: Thesis
Country: China    Candidate: Y Q Fu    Full Text: PDF
GTID: 2370330647960024    Subject: Science
Abstract/Summary:
This paper studies the variance optimization problem for the average reward in continuous-time Markov decision processes (CTMDPs). The state space is assumed to be countable and the action space to be a Borel measurable space. The main purpose is to find, within the class of deterministic stationary policies, a policy that minimizes the variance of the average reward. Unlike a standard Markov decision process, the cost function under the variance criterion is affected by future actions, because it involves the policy's own average reward. To address this, we convert the variance minimization problem into a standard MDP by introducing the notion of pseudo-variance. Further, by developing a policy iteration algorithm for the pseudo-variance optimization problem, we derive an optimal policy for the original variance optimization problem, and by establishing a variance difference formula we give a sufficient condition for a policy to be variance optimal. Finally, we demonstrate the application of these results to queueing systems and to birth-and-death processes with catastrophes.
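To fix ideas, the following display is a minimal sketch of the quantities behind this construction, assuming the standard long-run average formulations; the precise definitions used in the thesis may differ in detail. Here $d$ is a deterministic stationary policy, $x(t)$ the state process under $d$, $r$ the reward rate, and $\eta^{*}$ a fixed centering constant (for instance, the optimal average reward), which is introduced here only for illustration:

$$\eta(d)=\lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}^{d}\!\left[\int_{0}^{T} r\bigl(x(t),d(x(t))\bigr)\,dt\right],\qquad \sigma^{2}(d)=\lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}^{d}\!\left[\int_{0}^{T}\bigl(r(x(t),d(x(t)))-\eta(d)\bigr)^{2}\,dt\right],$$

$$\tilde{\sigma}^{2}(d)=\lim_{T\to\infty}\frac{1}{T}\,\mathbb{E}^{d}\!\left[\int_{0}^{T}\bigl(r(x(t),d(x(t)))-\eta^{*}\bigr)^{2}\,dt\right].$$

Because the centering constant $\eta^{*}$ in the pseudo-variance $\tilde{\sigma}^{2}(d)$ no longer depends on the policy, minimizing $\tilde{\sigma}^{2}$ is an ordinary average-cost continuous-time MDP with cost rate $(r-\eta^{*})^{2}$, which is what makes a policy iteration algorithm applicable.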
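For readers who want to experiment, the script below is a small numerical sketch, in Python, of average-cost policy iteration applied to such a pseudo-variance cost rate on a finite birth-death chain with catastrophes. The model, the rates, the reward function and the constant eta_star are all illustrative assumptions, not the examples or the algorithm analysed in the thesis.

# Minimal policy-iteration sketch for an average-cost continuous-time MDP,
# applied to the pseudo-variance cost rate c(i,a) = (r(i,a) - eta_star)**2.
# The birth-death-with-catastrophes model and all rates below are assumed
# for illustration only.
import numpy as np

N = 10                      # states 0..N (e.g. queue length)
actions = [1.0, 2.0, 3.0]   # hypothetical service rates to choose from
lam, kappa = 1.5, 0.2       # arrival rate and catastrophe rate (assumed)
eta_star = 1.0              # fixed centering constant of the pseudo-variance

def generator_row(i, mu):
    """Row i of the generator Q under service rate mu."""
    q = np.zeros(N + 1)
    if i < N:
        q[i + 1] += lam      # birth (arrival)
    if i > 0:
        q[i - 1] += mu       # death (service completion)
        q[0] += kappa        # catastrophe: jump back to state 0
    q[i] -= q.sum()          # diagonal entry makes the row sum to zero
    return q

def reward_rate(i, mu):
    """Hypothetical reward rate: throughput minus a holding cost."""
    return mu * (i > 0) - 0.1 * i

def cost_rate(i, mu):
    return (reward_rate(i, mu) - eta_star) ** 2   # pseudo-variance cost

def evaluate(policy):
    """Solve the Poisson equation Q_d h + c_d = g*1 with h[0] = 0."""
    Q = np.array([generator_row(i, policy[i]) for i in range(N + 1)])
    c = np.array([cost_rate(i, policy[i]) for i in range(N + 1)])
    A = np.zeros((N + 1, N + 1))
    A[:, 0] = -1.0           # column for the average cost g
    A[:, 1:] = Q[:, 1:]      # columns for h[1..N]; h[0] is fixed to 0
    sol = np.linalg.solve(A, -c)
    g, h = sol[0], np.concatenate(([0.0], sol[1:]))
    return g, h

def improve(h):
    """Greedy improvement: minimize c(i,a) + sum_j q(j|i,a) h(j)."""
    return [min(actions,
                key=lambda mu: cost_rate(i, mu) + generator_row(i, mu) @ h)
            for i in range(N + 1)]

policy = [actions[0]] * (N + 1)
for _ in range(50):
    g, h = evaluate(policy)
    new_policy = improve(h)
    if new_policy == policy:
        break
    policy = new_policy

g, _ = evaluate(policy)
print("pseudo-variance value:", g)
print("policy (service rate per state):", policy)

The sketch solves the continuous-time Poisson equation directly for each policy and then improves greedily, which is the standard average-cost policy iteration loop; the thesis's own algorithm and convergence analysis should be consulted for the precise iteration used there.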
Keywords/Search Tags: Continuous-time Markov decision process, Variance optimality of the average reward, Variance-optimal policy, Policy iteration