Research on Perturbation of Multi-Objective Markov Decision Processes

Posted on: 2006-05-10
Degree: Master
Type: Thesis
Country: China
Candidate: X Gong
Full Text: PDF
GTID: 2120360155455015
Subject: Basic mathematics
Abstract/Summary:
The theory of multi-objective Markov decision processes (MOMDPs) concerns optimization in multistage decision processes in a changing environment. It is widely used and has been studied for about half a century. Many results have been obtained, but they all focus on optimality equations or algorithms. In practical problems, however, data obtained by observation, experiment, and measurement may not be exact, so the parameters may contain errors. This thesis discusses perturbation problems for discrete-time Markov decision processes with the discounted criterion, discrete-time Markov decision processes with the average criterion, and continuous-time Markov decision processes with the discounted criterion, treating each in two steps.

First, this thesis discusses the optimality theory of the above models. So far, the existence of cone optimal strategies has been studied only for discrete-time Markov decision processes with the discounted criterion. This thesis extends the optimality equation theory of MDPs with the average criterion and continuous-time MDPs with the discounted criterion in [11], [12] to MOMDPs with the average criterion and continuous-time MOMDPs with the discounted criterion. Following the theory of multi-objective problems in [3], [9], it establishes the equations that cone optimal strategies of these models satisfy, and thereby establishes the optimality theory of multi-objective Markov decision processes.

Previously, nearly all research on perturbation theory in MOMDPs was based on properties of transition probabilities. It explored how perturbation affects the transition probabilities and then established policy iteration algorithms based on perturbation theory to find satisfactory strategies (see [37], [38]). Research on how perturbation affects optimal strategies and optimal objectives, however, exists only for MDPs (see [2], [4], [8]). This thesis extends the perturbation theories in [2], [4], [8] to MOMDPs, establishing results on whether cone optimal strategies remain cone optimal under a given perturbation and on the bound on the perturbation that guarantees this. It concludes that the degree of perturbation of the cone optimal objective vectors may change with that of the transition probabilities or transition velocity. Therefore, we can solve practical problems better with the policy obtained...
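To make the perturbation question concrete, the following is a minimal Python sketch, not taken from the thesis. It evaluates one fixed stationary policy in a small discounted MDP with two objectives, perturbs the transition probabilities, and compares the change in the objective vectors with the standard bound ‖V' − V‖∞ ≤ β/(1 − β) · ‖P' − P‖∞ · ‖V‖∞ for a fixed policy. The two-state model, the rewards, the discount factor, and all function names are hypothetical choices for illustration; cone optimality itself is not checked here.

```python
def evaluate_policy(P, R, beta, iters=2000):
    """Discounted value of a fixed stationary policy.

    P: n x n transition matrix under the policy (rows sum to 1).
    R: n x k reward matrix, one column per objective.
    Returns an n x k list V approximating the solution of V = R + beta * P V,
    computed by fixed-point iteration.
    """
    n, k = len(R), len(R[0])
    V = [[0.0] * k for _ in range(n)]
    for _ in range(iters):
        V = [[R[i][j] + beta * sum(P[i][s] * V[s][j] for s in range(n))
              for j in range(k)]
             for i in range(n)]
    return V


def max_abs(M):
    """Largest absolute entry of a matrix given as a list of rows."""
    return max(abs(x) for row in M for x in row)


def row_sum_distance(A, B):
    """Induced infinity-norm of A - B (maximum absolute row sum)."""
    return max(sum(abs(a - b) for a, b in zip(ra, rb))
               for ra, rb in zip(A, B))


# Hypothetical two-state, two-objective example.
beta = 0.9
P = [[0.8, 0.2], [0.3, 0.7]]
R = [[1.0, 0.0], [0.0, 2.0]]

# Perturb the first row of P by eps while keeping it a probability vector.
eps = 0.05
P_pert = [[0.8 - eps, 0.2 + eps], [0.3, 0.7]]

V = evaluate_policy(P, R, beta)
V_pert = evaluate_policy(P_pert, R, beta)

# Observed change in the objective vectors versus the a priori bound.
diff = max_abs([[a - b for a, b in zip(ra, rb)] for ra, rb in zip(V, V_pert)])
bound = beta / (1 - beta) * row_sum_distance(P, P_pert) * max_abs(V)

print("change in objective vectors:", diff)
print("bound from perturbation size:", bound)
```

The bound depends only on the discount factor, the size of the perturbation, and the unperturbed values, so it can be computed before the perturbed model is solved. Deciding whether a strategy stays cone optimal under the perturbation requires comparing such vectors across strategies with respect to the chosen cone, which this sketch does not attempt.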
Keywords/Search Tags: MOMDP, discounted criteria, average criteria, transition probability, transition velocity, perturbation, cone optimal, inventory management