
Applied Research On Resource Allocation With Approximate Dynamic Programming

Posted on: 2015-01-01
Degree: Master
Type: Thesis
Country: China
Candidate: Y J Jin
Full Text: PDF
GTID: 2250330428998555
Subject: Management Science and Engineering

Abstract/Summary:
Most resource allocation problems have discrete or continuous state and decision spaces. Small and medium-sized problems are generally solved with Dynamic Programming (DP), Variational Inequality methods, or the Maximum Principle. For large-scale resource allocation problems, however, these traditional methods face the "curse of dimensionality" whether or not a model is available: the running time of classical DP grows exponentially with problem size, Variational Inequality methods cannot solve optimization problems with a closed constraint set, and the Maximum Principle gives only necessary conditions for optimality. Approximate Dynamic Programming (ADP), which combines reinforcement learning, neural networks, adaptive critic designs, and the basic principles of classical DP, can solve complicated nonlinear problems. By computing approximate value functions efficiently, ADP avoids the "curse of dimensionality" and overcomes the defects of the other methods. Using a combination of online and offline simulation training, it can adapt to changes in system parameters in real time, so ADP has attracted extensive research in recent years.

In this paper, we establish a mathematical model based on approximate dynamic programming for the general high-dimensional discrete resource allocation problem, and propose a model-based Actor-Critic algorithm. We then verify the validity of the model and the convergence of the algorithm on two instances.

The first instance is a car rental problem: the cars at all of a rental company's locations must be allocated so as to maximize total profit while keeping car holdings changing smoothly. During the computer simulation we performed a sensitivity analysis of each parameter based on the optimal value function and the constraint conditions, and defined a customer-service indicator from which to derive expansion plans.
We found that profit increases as maintainability becomes more powerful, but once maintainability rises past a certain level, profit stops increasing because of the added costs. Finally, using a policy improvement algorithm, we obtained the best car scheduling strategy, which achieves a "steady state".

The second instance is a portfolio problem with transaction costs. It faces a "curse of dimensionality" in uncertain factors such as investment amount, transaction costs, turnover, yields, and external information. We set up a two-stage ADP model for the long-term portfolio problem. In the first stage, we divided the whole problem into stages along the time series, then solved a series of linear programming models according to the principle of maximum profit and obtained the optimal stock holdings. In the second stage, we used a fixed-capital dynamic programming model with a piecewise linear function approximation to solve each time period iteratively, obtaining a portfolio strategy for each period under different risk factors.
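The piecewise linear approximation used in the second stage can be sketched as follows. A concave value of holding q units is represented by its slopes (marginal values), each slope is smoothed toward sampled marginal payoffs, and a projection restores concavity after each update. The payoff and demand model here is a stand-in assumption for illustration, not the thesis's portfolio data:

```python
import numpy as np

# Piecewise-linear value function approximation (sketch; the per-unit payoff
# and the Poisson "demand" below are illustrative assumptions).
# V(q), the value of holding q units, is stored via its slopes
# v[r] ~ V(r+1) - V(r), kept non-increasing so V stays concave.
Q = 10                       # maximum holding
v = np.zeros(Q)              # slope estimates (marginal values)
rng = np.random.default_rng(1)

for n in range(2000):
    alpha = 1.0 / (1.0 + 0.05 * n)     # decreasing smoothing stepsize
    price = 10.0 + rng.normal()        # sampled per-unit payoff
    demand = rng.poisson(5)            # units that actually pay off
    for r in range(Q):
        vhat = price if r < demand else 0.0   # sampled marginal value
        v[r] = (1 - alpha) * v[r] + alpha * vhat
    for r in range(1, Q):
        v[r] = min(v[r], v[r - 1])     # projection: restore concavity

def value(q):
    """Evaluate the piecewise-linear approximation at integer holding q."""
    return float(np.sum(v[:q]))
```

The slopes converge toward the expected marginal payoff of each successive unit, so reading off where the slope falls below the unit cost of capital gives the holding decision for that period; repeating this per time period is the iterative scheme the second stage describes.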
Keywords/Search Tags:Approximate Dynamic Programming, Resource Allocation, Car Rental, Investment Portfolio, Optimal Policy