Optimality Equation Of Continuous-time Markov Decision Process Based On Discount Criterion

Posted on:2010-06-11

Degree:Master

Type:Thesis

Country:China

Candidate:J Li

Full Text:PDF

GTID:2189360278960189

Subject:Probability theory and mathematical statistics

Abstract/Summary:

Continuous-time Markov decision models are widely used in practice. How to choose an optimal policy for a Markov decision process depends chiefly on the decision criterion adopted. The average reward criterion and the discounted reward criterion are the two most widely used criteria. Because most of the existing literature on continuous-time Markov decision processes concerns the average reward criterion, research under the discounted reward criterion remains incomplete. This thesis fills gaps in three areas for continuous-time Markov decision processes under the discounted reward criterion: the statement of optimality conditions, the establishment of the optimality equation, and the properties of optimal policies. It also gives decision makers a basis for solving economic decision problems that come down to discounted rewards.

The thesis deals with the α-discount reward criterion for continuous-time Markov decision processes in general state and action spaces, where the transition rates and the reward rates are allowed to be unbounded. The main work is as follows:

① As a precondition for the existence of the optimality equation, the optimality conditions are given first. They comprise three assumptions on the system's primitive data and two lemmas derived from those assumptions.

② Building on the proofs of these optimality conditions, the existence of the α-discount reward optimality equation is established, and a corresponding discount-optimal stationary policy is found in the course of the proof. The policy iteration algorithm provided in the thesis rests on the three assumptions on the system's primitive data, so the assumption on the relative difference of the reward function is dropped in order to preserve the authenticity of the system's primitive data.

③ To keep the choice of policy from being affected by randomness and to reduce instability, the existence of an ε-average optimal stationary policy is also ensured under the given optimality conditions. Some properties of average optimal stationary policies are presented, which help simplify the decision process.

④ Finally, a real economic case from e-commerce is used to explain in detail how the α-discount reward optimality equation is applied to such problems. To illustrate applications in other areas, brief illustrations of the modelling principles and the essence of the problems are given, showing that the optimality equation based on the discounted reward criterion works effectively on these problems.
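
The abstract does not reproduce the thesis's own equations or algorithm, so the following is only a minimal sketch of the general idea, under simplifying assumptions that are not in the source: a finite state space, finite action sets, and bounded rates (the thesis itself allows general spaces and unbounded rates). In that simplified setting the α-discount optimality equation is commonly written as αV(x) = max over a of { r(x,a) + Σ_y q(y|x,a) V(y) }, and textbook policy iteration alternates between solving (αI − Q_f)V = r_f for the current stationary policy f and improving f greedily. The names `q`, `r`, `alpha` and all numbers below are hypothetical.

```python
# Sketch: textbook policy iteration for the alpha-discounted criterion on a
# FINITE continuous-time MDP. Illustrative only; not the thesis's algorithm.

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda k: abs(M[k][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for k in range(col + 1, n):
            factor = M[k][col] / M[col][col]
            for c in range(col, n + 1):
                M[k][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def evaluate(policy, q, r, alpha):
    """Discounted value of stationary policy f: solve (alpha*I - Q_f) V = r_f."""
    n = len(policy)
    A = [[(alpha if x == y else 0.0) - q[x][policy[x]][y] for y in range(n)]
         for x in range(n)]
    b = [r[x][policy[x]] for x in range(n)]
    return solve_linear(A, b)

def q_value(x, a, V, q, r):
    """Right-hand side of the optimality equation for state x and action a."""
    return r[x][a] + sum(q[x][a][y] * V[y] for y in range(len(V)))

def policy_iteration(q, r, alpha, tol=1e-12):
    n = len(q)
    policy = [0] * n  # start from an arbitrary stationary policy
    while True:
        V = evaluate(policy, q, r, alpha)
        improved = False
        for x in range(n):
            best = max(range(len(q[x])), key=lambda a: q_value(x, a, V, q, r))
            # Switch only on strict improvement so the loop cannot cycle.
            if q_value(x, best, V, q, r) > q_value(x, policy[x], V, q, r) + tol:
                policy[x] = best
                improved = True
        if not improved:
            return policy, V

# Hypothetical data: q[x][a][y] are transition rates (each row sums to zero),
# r[x][a] are reward rates, alpha is the discount rate.
q = [[[-1.0, 1.0], [-3.0, 3.0]],
     [[2.0, -2.0], [0.5, -0.5]]]
r = [[1.0, 0.5],
     [4.0, 3.0]]
alpha = 0.1

policy, V = policy_iteration(q, r, alpha)
print("policy:", policy)
print("values:", V)
```

When the loop stops, the returned `V` satisfies the finite-state optimality equation up to numerical tolerance, and `policy` picks a maximizing action in each state. Extending this to unbounded rates and general state spaces is exactly where the assumptions and lemmas described in the abstract are needed; the sketch does not address that.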

Keywords/Search Tags:

Continuous-time Markov Decision Processes, Optimality Equation, Optimality Policy, Character of Optimality Policy, Application Analysis

Related items

1	The Optimality Of The Friedman Rule
2	Fairness, Social Optimality and Individual Rationality in Agent Interactions
3	On Asymptotic Optimality In Linear Empirical Bayes Premium
4	Regional integration readiness of The Gambia: Empirical assessments of the optimality of the Sene-Gambia as a currency area and the trade facilitation effects of the Sene-Gambia Confederation on the Gambian economy
5	Stock Index Futures To The Chinese Stock Market
6	The Studying Of Urban Land Intensively Use In Wuhan Based On Economic Optimality Theory
7	Research On Engineering Schedule Management Of C Project
8	Study On The Optimality Of Tax Service In China
9	Study Of The Criteria For Allocation Of Social Income
10	Stability Analysis For A Cooperative Game Model