
Optimal Control For Discrete-Time Markov Processes: New Optimality Conditions And Approaches

Posted on: 2006-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Q X Zhu
Full Text: PDF
GTID: 1100360182960542
Subject: Probability and Statistics
Abstract/Summary:
This Ph.D. thesis is mainly concerned with some important problems in discrete-time Markov decision processes (DTMDP). These problems include: (1) limsup and liminf average optimality in a denumerable space; (2) average optimality for DTMDP in Borel spaces: the existence of an average optimal stationary policy, its value iteration algorithm, and its characterization; (3) average sample-path optimality for DTMDP in Borel spaces; (4) variance optimality for DTMDP in Borel spaces; (5) strong n (n = -1, 0)-discount optimality for DTMDP in Borel spaces. The results of this thesis improve, extend, unify, and complement a number of existing results, and they also apply to a number of cases not covered in the previous literature. Interesting examples, such as an inventory system and controlled queueing models, are used to illustrate our conditions and results. The thesis is composed of seven chapters.

In Chapter I, we introduce the historical background, the subject, and the recent development of DTMDP. The main results of the thesis are also briefly introduced.

Chapter II deals with limsup and liminf average optimality in a denumerable space. We give a new set of conditions under which the existence of an optimal stationary policy for the two average criteria is ensured. The results in this chapter are applied to an admission control queueing model.

In Chapter III, we study DTMDP with Borel state and action spaces. The criterion to be minimized is the average expected cost. We first provide "two optimality inequalities with opposed directions" and give conditions for the existence of solutions to the two inequalities. Then, from the two inequalities, we ensure the existence of average optimal (deterministic) stationary policies under additional continuity-compactness assumptions. Our conditions are weaker than those in the previous literature; in particular, some new sufficient conditions for the existence of average optimal stationary policies are imposed on the primitive data of the model. Moreover, our approach differs slightly from the well-known "optimality inequality approach" widely used in DTMDP. We also study further properties of optimal policies: we not only obtain two necessary and sufficient conditions for optimal policies, but also give a "semimartingale characterization" of an average optimal stationary policy. Finally, we apply our results to a controlled queueing system as well as the generalized Potlatch process with control.

Chapter IV is concerned with the value iteration algorithm for average cost Markov decision processes in Borel state and action spaces. The costs may have neither upper nor lower bounds, in contrast to the nonnegativity (or lower-boundedness) assumptions widely used in the previous literature. Moreover, our conditions are weaker than those in the previous literature.

Chapter V deals with DTMDP with average sample-path costs (ASPC) in Borel spaces. We propose new conditions for the existence of ε-ASPC-optimal (deterministic) stationary policies within the class of all randomized history-dependent policies. Our conditions are weaker than those in the previous literature; in particular, the stochastic monotonicity condition in this chapter is used here for the first time to study the ASPC criterion. Moreover, the approach provided here differs slightly from the "optimality equation approach" widely used in the previous literature. Under a mild assumption, we also prove that ASPC-optimality and average expected cost optimality are equivalent. Finally, we use a controlled queueing system to illustrate our results.
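For reference, the criteria and inequalities named in Chapters II-V have the following standard forms in the DTMDP literature. The displays below are an illustrative sketch in assumed notation (state process {x_t}, actions {a_t}, admissible action sets A(x), one-stage cost c, transition kernel Q), not quotations from the thesis itself.

% Illustrative sketch of standard definitions; notation is assumed and may
% differ from the thesis. Limsup and liminf long-run average expected costs
% of a policy \pi from initial state x:
\[
  \bar J(x,\pi) := \limsup_{n\to\infty}\frac1n\,
    \mathbb E_x^{\pi}\Bigl[\sum_{t=0}^{n-1}c(x_t,a_t)\Bigr],
  \qquad
  \underline J(x,\pi) := \liminf_{n\to\infty}\frac1n\,
    \mathbb E_x^{\pi}\Bigl[\sum_{t=0}^{n-1}c(x_t,a_t)\Bigr].
\]
% Two average-cost optimality inequalities "with opposed directions"
% (g: optimal average cost; h, h': relative value functions):
\[
  g + h(x) \;\ge\; \inf_{a\in A(x)}\Bigl\{c(x,a)+\int_X h(y)\,Q(\mathrm dy\mid x,a)\Bigr\},
\]
\[
  g + h'(x) \;\le\; \inf_{a\in A(x)}\Bigl\{c(x,a)+\int_X h'(y)\,Q(\mathrm dy\mid x,a)\Bigr\}.
\]
% Value iteration recursion (Chapter IV), whose normalized iterates recover g:
\[
  v_{n+1}(x) = \inf_{a\in A(x)}\Bigl\{c(x,a)+\int_X v_n(y)\,Q(\mathrm dy\mid x,a)\Bigr\},
  \qquad \frac{v_n(x)}{n}\;\longrightarrow\; g.
\]

In the standard theory, a stationary policy that selects, for each state x, an action attaining the infimum in the first inequality is average optimal under suitable conditions; this is the mechanism the "optimality inequality approach" exploits.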
In Chapter VI, we study DTMDP with variance minimization in Borel spaces. The costs may have neither upper nor lower bounds. We propose another set of conditions on the system's primitive data, under which we prove the existence of a variance-minimal policy within the class of average expected cost (AEC) optimal stationary policies. It should be noted that the approach provided here differs slightly from the "optimality equation approach" widely used in the previous literature. Finally, we use a controlled queueing system to illustrate our results.

In Chapter VII, we study discrete-time Markov decision processes with average expected costs (AEC) and discount-sensitive criteria in Borel state and action spaces. We propose another set of conditions on the system's primitive data, under which we prove that (1) AEC optimality and strong (-1)-discount optimality are equivalent; (2) a condition equivalent to the strong 0-discount optimality of a stationary policy holds; and (3) strong n (n = -1, 0)-discount optimal stationary policies exist. Our conditions are weaker than those in the previous literature. Moreover, we provide a new approach to prove the existence of strong 0-discount optimal stationary policies; it differs from those in the previous literature, which proceed via bias optimality, a notion we do not use at all. Finally, we apply our results to an inventory system and a controlled queueing system.
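Likewise, the variance and discount-sensitivity criteria of Chapters VI and VII are commonly formulated as follows; again this is an illustrative sketch in assumed notation, not an excerpt from the thesis.

% Limiting average variance of a policy \pi (Chapter VI), to be minimized
% over the class of AEC-optimal stationary policies:
\[
  V(x,\pi) := \limsup_{n\to\infty}\frac1n\,
    \mathbb E_x^{\pi}\Bigl[\sum_{t=0}^{n-1}
      \bigl(c(x_t,a_t)-\bar J(x,\pi)\bigr)^{2}\Bigr].
\]
% Discounted cost and strong n-discount optimality, n = -1, 0 (Chapter VII).
% A policy \pi^* is strong n-discount optimal if the limit below holds:
\[
  V_\alpha(x,\pi) := \mathbb E_x^{\pi}\Bigl[\sum_{t=0}^{\infty}\alpha^{t} c(x_t,a_t)\Bigr],
  \qquad 0<\alpha<1,
\]
\[
  \lim_{\alpha\uparrow 1}(1-\alpha)^{-n}
    \bigl[V_\alpha(x,\pi^*)-V_\alpha(x,\pi)\bigr]\;\le\; 0
  \quad\text{for all policies }\pi\text{ and states }x.
\]

With n = -1, the factor (1 - α) turns discounted costs into average costs in the Abelian limit, which is the mechanism behind the equivalence (1) asserted for Chapter VII.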
Keywords/Search Tags:discrete-time Markov decision process, optimal stationary policy, value iteration algorithm, average criterion, variance criterion