New methods for dynamic programming over an infinite time horizon

Posted on: 2003-07-14

Degree: Ph.D

Type: Dissertation

University: Stanford University

Candidate: O'Sullivan, Michael Justin

Full Text: PDF

GTID: 1460390011484937

Subject: Operations Research

Abstract/Summary:

Two unresolved issues regarding dynamic programming over an infinite time horizon are addressed within this dissertation. Previous research uses policy improvement to find a strong-present-value optimal policy in such systems, but the time complexity of policy improvement is not known. Here, a method is presented for substochastic systems that breaks the problem of finding a strong-present-value optimal policy into a number of smaller dynamic programming subproblems. Each of the subproblems may be solved using linear programming, giving the entire process a polynomial running time with regard to the size of the original dynamic programming problem. Also, for a specialization of a substochastic system, other solution methods may be applied and the time complexity becomes strongly polynomial.

For normalized systems, policy improvement is still the method of choice. However, policy improvement requires the solution of linear systems that may not always have full rank. In a finite precision environment, the stability of solution methods for these linear systems is critical. One may simplify the computations associated with policy improvement by classifying the states and considering each class separately. Here, a method is presented that applies to any policy with substochastic classes. The method uses the state classification to break the linear system into many smaller linear systems that are either of full rank or rank-deficient by one. Each of the smaller linear systems is then solved in a numerically stable way.

During the development of the previous numerically stable method, the need for a sparse rank-revealing LU factorization became apparent. No such factorization exists in current literature. Here, a new sparse LU factorization is presented that uses a threshold form of complete pivoting. It is found to be rank-revealing for all but the most pathological matrices.
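As a rough illustration of what a threshold form of complete pivoting can look like, the sketch below runs Gaussian elimination on a small dense matrix and counts the accepted pivots as a rank estimate. It is not the factorization developed in the dissertation: the function name `estimate_rank_threshold_pivoting`, the threshold factor `u`, the relative tolerance `tol`, and the Markowitz-style tie-breaking rule are all assumptions made for this example, and a real sparse code would use sparse data structures rather than nested lists.

```python
def estimate_rank_threshold_pivoting(A, u=2.0, tol=1e-12):
    """Eliminate A with threshold complete pivoting (illustrative sketch).

    Any entry whose magnitude is within a factor u of the largest entry in
    the active submatrix is an acceptable pivot; among those, the entry with
    the smallest Markowitz count is chosen to limit fill-in.
    Returns (estimated rank, list of pivot positions).
    """
    U = [[float(x) for x in row] for row in A]
    if not U or not U[0]:
        return 0, []
    rows = list(range(len(U)))     # rows not yet used as pivot rows
    cols = list(range(len(U[0])))  # columns not yet used as pivot columns
    scale = max(abs(x) for row in U for x in row)
    pivots = []
    if scale == 0.0:
        return 0, pivots

    while rows and cols:
        amax = max(abs(U[i][j]) for i in rows for j in cols)
        if amax <= tol * scale:
            break  # remaining submatrix is numerically zero

        # Nonzero counts of each active row and column (for the Markowitz count).
        rcount = {i: sum(1 for j in cols if U[i][j] != 0.0) for i in rows}
        ccount = {j: sum(1 for i in rows if U[i][j] != 0.0) for j in cols}

        best = None
        for i in rows:
            for j in cols:
                if abs(U[i][j]) * u >= amax:  # passes the threshold test
                    cost = (rcount[i] - 1) * (ccount[j] - 1)
                    if best is None or cost < best[0]:
                        best = (cost, i, j)
        _, p, q = best
        pivots.append((p, q))
        rows.remove(p)
        cols.remove(q)

        # Eliminate column q from the remaining active rows.
        for i in rows:
            f = U[i][q] / U[p][q]
            if f != 0.0:
                for j in cols:
                    U[i][j] -= f * U[p][j]
                U[i][q] = 0.0

    return len(pivots), pivots


# The second row is twice the first, so the estimated rank is 2.
rank, pivots = estimate_rank_threshold_pivoting([[1, 2, 3], [2, 4, 6], [1, 0, 1]])
print(rank)
```

With `u = 1` the rule reduces to ordinary complete pivoting; larger values of `u` give the pivot search more freedom to prefer sparsity at some cost in stability, which is the usual trade-off in threshold pivoting schemes.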

Keywords/Search Tags:

Dynamic programming, Time, Policy, Method, Uses, Linear systems

Related items

1	Dynamic Programming Problem In Mathematics Modelling
2	High-dimensional adaptive dynamic programming with mixed integer linear programming
3	Research On The Optimal Control Of Single-Phase Photovoltaic Systems Based On Adaptive Dynamic Programming
4	PI-based Self-learning Optimal Control Of Linear Singularly Perturbed Systems
5	Indefinite Linear Quadratic Optimal Control For Discrete-Time Uncertain Systems
6	L₁-Gain Performance Analysis Of Discrete-Time Positive Linear Time-Delay Systems And Its Applications
7	Researches On Optimal Control Of Nonlinear Systems Based On Approximate Dynamic Programming
8	Optimal Control Of Nonlinear Systems With Time Delays Based On Adaptive Dynamic Programming Approach
9	Dynamic Programming Model And Numerical Solution For Optimal Monetary Policy
10	A New Method To Fuzzy Constraints Linear Programming