Research On Perturbation Of Multi-Objective Markov Decision Processes

Posted on:2006-05-10

Degree:Master

Type:Thesis

Country:China

Candidate:X Gong

GTID:2120360155455015

Subject:Basic mathematics

Abstract/Summary:

The theory of multi-objective Markov decision processes (MOMDPs) concerns optimization in multistage decision processes in a changing environment. It is widely used and has been studied for about half a century. Many results have been obtained, but they all focus on optimality equations or algorithms. In practical problems, however, data obtained by observation, experiment, and measurement may be inexact, so the parameters may contain errors. This paper discusses perturbation problems for three models: discrete-time Markov decision processes with the discounted criterion, discrete-time Markov decision processes with the average criterion, and continuous-time Markov decision processes with the discounted criterion. Each model is treated in two steps.

First, this paper discusses the optimality theory of the above models. So far, the existence of cone optimal strategies has been studied only for discrete-time Markov decision processes with the discounted criterion. This paper extends the optimality equation theory of MDPs with the average criterion and of continuous-time MDPs with the discounted criterion in [11], [12] to MOMDPs with the average criterion and to continuous-time MOMDPs with the discounted criterion. Following the theory of multi-objective problems in [3], [9], it establishes the equations that cone optimal strategies of these models satisfy, and thereby establishes the optimality theory of multi-objective Markov decision processes.

Previously, nearly all research on perturbation theory in MOMDPs was based on properties of the transition probabilities. It explored how perturbation affects the transition probabilities, and then established policy iteration algorithms based on perturbation theory to find satisfactory strategies (see [37], [38]). Research on how perturbation affects optimal strategies and optimal objectives, however, exists only for MDPs (see [2], [4], [8]). This paper extends the perturbation theories in [2], [4], [8] to MOMDPs, and establishes results on whether cone optimal strategies remain cone optimal under a given perturbation, together with bounds on the perturbation under which this holds. This paper concludes that the degree of perturbation of the cone optimal objective vectors may vary with that of the transition probabilities or transition rates (transition velocities). Therefore, we can solve practical problems better with the policy obtained...
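
As a rough numerical illustration of how an error in the transition probabilities propagates to the objective vectors, the sketch below evaluates one fixed stationary policy of a two-state, two-objective discounted model before and after a small perturbation. It is not the method of the thesis: it only checks the standard fixed-policy bound |V - V'| <= beta * |P - P'| * max|r| / (1 - beta)^2 in the max-row-sum norm. All names (`evaluate`, `row_sum_distance`, `P_pert`) and all numbers are assumptions made for the example.

```python
def evaluate(P, R, beta, iters=2000):
    """Discounted value vectors of a fixed policy, by successive approximation.

    P[s][t] is the transition probability from state s to t under the policy;
    R[s] is a tuple holding one reward per objective.
    """
    n, k = len(P), len(R[0])
    V = [[0.0] * k for _ in range(n)]
    for _ in range(iters):
        V = [
            [R[s][j] + beta * sum(P[s][t] * V[t][j] for t in range(n))
             for j in range(k)]
            for s in range(n)
        ]
    return V


def row_sum_distance(P, Q):
    """Max-row-sum (infinity) norm of P - Q."""
    return max(sum(abs(p - q) for p, q in zip(rp, rq)) for rp, rq in zip(P, Q))


# Hypothetical data: two states, two objectives, discount factor 0.9.
beta = 0.9
P = [[0.80, 0.20], [0.30, 0.70]]        # nominal transition matrix
P_pert = [[0.75, 0.25], [0.35, 0.65]]   # perturbed transition matrix
R = [(1.0, 0.0), (0.0, 2.0)]            # vector-valued rewards

V = evaluate(P, R, beta)
V_pert = evaluate(P_pert, R, beta)

# Largest change in any objective at any state.
gap = max(abs(a - b) for va, vb in zip(V, V_pert) for a, b in zip(va, vb))

# Fixed-policy perturbation bound on that change.
r_max = max(abs(x) for row in R for x in row)
bound = beta * row_sum_distance(P, P_pert) * r_max / (1 - beta) ** 2

print("observed gap:", gap)
print("bound:", bound)
```

The bound is loose, but it is linear in the size of the perturbation of the transition matrix, which is the qualitative behavior the abstract describes. Whether a strategy stays cone optimal additionally depends on the ordering cone, which this sketch does not model.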

Keywords/Search Tags:

MOMDP, discounted criteria, average criteria, transition probability, transition velocity, perturbation, cone optimal, inventory management

Related items

1	Study On The Transition To Turbulence In Slip Channel Flow
2	Model Selection And Hypothesis Testing
3	The Uniqueness, Perturbation Bound And Algorithm Of The Stationary Probability Matrix For Transition Probability Tensors
4	Separability Criteria And Entanglement Criteria Of Quantum State
5	Optimality Of Several Kinds Of Important Designs Under Q And Q_B Criteria
6	The Research On Marine Riser Design Criteria
7	The PageRank Algorithm Based On The Transition Probability
8	Optimal Investment And Reinsurance Strategies Under Mean-Variance Criteria
9	Study On Dynamic Average Outgoing Quality On CSP-2-P
10	Eigenvalue Criteria For Existence Of Multiple Positive Solutions Of Boundary Value Problems For Ordinary Differential Equations