
State space reduction for hierarchical policy formation

Posted on: 2004-02-25  Degree: M.S  Type: Thesis
University: The University of Texas at Arlington  Candidate: Asadi, Mehran  Full Text: PDF
GTID: 2469390011473209  Subject: Computer Science
Abstract/Summary:
This thesis provides new techniques for abstracting the state space of a Markov Decision Process (MDP). These techniques extend a recent minimization model, known as epsilon-reduction, to construct a partition space with fewer states than the original MDP. As a result, learning policies on the partition space should be faster than on the original state space.

The technique presented here is to execute a policy instead of a single action, and to group together all states whose transition probabilities and reward function differ only slightly under a given policy. This method resembles an SMDP obtained by expanding the actions of the original MDP.

When the reward structure is not known, a reward-independent method is introduced for state aggregation. This method is applied when reward information is not available, and a theorem in this thesis proves the solvability of this type of partition.

Simulations over different state spaces show that the policies learned on the original MDP and on this representation are very close, and that the total learning time in the partition space is much smaller than the total time spent learning in the original state space.
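To illustrate the kind of aggregation the abstract describes, the sketch below groups states whose rewards and transition distributions, under a fixed policy, differ by at most epsilon. It is only an illustrative approximation of epsilon-reduction, not the thesis's implementation; the dictionary-based MDP representation, the `epsilon_aggregate` function, and the greedy pairwise grouping are all assumptions introduced here.

```python
def epsilon_aggregate(states, policy, P, R, epsilon):
    """Partition `states` so that states in the same block have reward
    difference <= epsilon and transition distributions (under `policy`)
    whose total absolute difference is <= epsilon.

    P[s][a] is a dict mapping next states to probabilities; R[s][a] is a
    scalar reward. This layout is assumed for illustration only."""

    def close(s, t):
        a_s, a_t = policy[s], policy[t]
        # Rewards under the given policy must be within epsilon.
        if abs(R[s][a_s] - R[t][a_t]) > epsilon:
            return False
        # Compare next-state distributions over the union of successors.
        succ = set(P[s][a_s]) | set(P[t][a_t])
        diff = sum(abs(P[s][a_s].get(x, 0.0) - P[t][a_t].get(x, 0.0)) for x in succ)
        return diff <= epsilon

    blocks = []
    for s in states:
        for block in blocks:
            if all(close(s, t) for t in block):
                block.append(s)
                break
        else:
            blocks.append([s])
    return blocks


if __name__ == "__main__":
    # Tiny 3-state example: states 0 and 1 behave almost identically
    # under the given policy, so they fall into the same block.
    P = {0: {"a": {2: 1.0}}, 1: {"a": {2: 0.98, 0: 0.02}}, 2: {"a": {2: 1.0}}}
    R = {0: {"a": 1.0}, 1: {"a": 1.05}, 2: {"a": 0.0}}
    policy = {0: "a", 1: "a", 2: "a"}
    print(epsilon_aggregate([0, 1, 2], policy, P, R, epsilon=0.1))  # [[0, 1], [2]]
```

In this toy example the three-state MDP collapses to a two-block partition; learning a policy over the blocks then operates on a smaller state space, which is the speed-up the abstract refers to.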
Keywords/Search Tags:State space, Original MDP, Policy, Reward independent method