Look-ahead Control For CSPS Model Based On Learning

Posted on:2008-07-09

Degree:Master

Type:Thesis

Country:China

Candidate:H Wu

GTID:2178360215950903

Subject:Computer application technology

Abstract/Summary:

The Conveyor-Serviced Production Station (CSPS) is an important model of real-world production and a classical problem in Industrial Engineering (IE) and Operations Research (OR). With assembly-line production now widespread, research on the CSPS problem has practical value. Depending on its specific characteristics, a CSPS problem can be modeled as a Markov Decision Process (MDP) or a Semi-Markov Decision Process (SMDP) within the framework of discrete event dynamic systems (DEDS), and its optimal control problem can be solved with dynamic programming and reinforcement learning methods. The theory of Markov performance potentials provides a new theoretical framework for optimizing MDPs and SMDPs. Because performance potentials can be defined on a sample path, reinforcement learning and rollout methods combine naturally with them, which enriches the set of algorithms available for optimizing CSPS systems.

Look-ahead control is an important method for the CSPS problem: the system uses information about the production station and the conveyor to choose a reasonable action. This thesis studies look-ahead control of the CSPS model based on performance potential theory. First, it considers the CSPS problem with unload time, that is, the time needed to take a part off the conveyor. The CSPS system is described as an SMDP, and formulas are derived for its main model parameters. With the SMDP model parameters known, policy iteration based on performance potentials is studied for the CSPS. Second, starting from the sample-path definition of the performance potential, the thesis gives potential-based Q-learning formulas and the corresponding optimization algorithms. It also studies the CSPS model with a rollout algorithm that is unified under the average-cost and discounted-cost criteria, and presents the related formulas and optimization algorithms. A perturbation technique and historical information are used to improve the rollout algorithm. The results show that the model-free nature of Q-learning and the rollout algorithm is an advantage when optimizing real-world production problems. Finally, the thesis presents several production examples, compares the results of the three algorithms, and analyzes how several main parameters influence the system. The results indicate that the algorithms are effective.
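
For readers unfamiliar with Q-learning on an SMDP, the following is a minimal sketch of one tabular update under a discounted criterion. It is not the potential-based formula from the thesis, which the abstract does not state. It is the commonly used discounted SMDP Q-learning update, written here in terms of rewards to be maximized instead of costs. The function names, the toy table size, and all numeric values are assumptions made for illustration only.

```python
import math


def smdp_q_update(q, state, action, reward_rate, sojourn, next_state,
                  alpha=0.1, beta=0.05):
    """Apply one discounted SMDP Q-learning update in place; return the new value.

    Assumptions (illustrative, not taken from the thesis): reward_rate is
    constant over the sojourn time, and beta > 0 is the continuous-time
    discount rate.
    """
    discount = math.exp(-beta * sojourn)
    # Reward accumulated over the sojourn, discounted continuously.
    accumulated = reward_rate * (1.0 - discount) / beta
    target = accumulated + discount * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def greedy_action(q, state):
    """Return the index of the highest-valued action in the given state."""
    values = q[state]
    return max(range(len(values)), key=values.__getitem__)


# Hypothetical toy table: 3 states (e.g. buffer vacancies), 2 look-ahead actions.
q_table = [[0.0, 0.0] for _ in range(3)]
smdp_q_update(q_table, state=0, action=1, reward_rate=1.0,
              sojourn=2.0, next_state=1, alpha=0.5, beta=0.5)
print(greedy_action(q_table, 0))
```

The update needs only an observed transition (state, action, sojourn time, reward, next state), not the transition probabilities, which is the model-free property the abstract refers to.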

Keywords/Search Tags:

Conveyor-Serviced Production Station (CSPS) Model, Look-ahead Control, Semi-Markov Decision Process (SMDP), Reinforcement Learning, Performance Potential, Rollout Algorithm

Related items

1	Optimal Control Model And Method For Single Conveyor-Serviced Production Station With Multi-Type Products
2	Robust Control Of Single Conveyor-Serviced Production Station
3	Optimization Control Approach Of CSPS Based On Event And Stochastic Demand
4	Performance Potential-based NDP Optimization Approaches And Application Research For SMDP
5	RBF-Q Learning Optimization Algorithm Of Conveyor-serviced Production Station With Multi-type Products
6	Parallel Algorithms For Large-Scale Markov Decision Processes Based On Performance Potentials
7	Continuous-Time Unified MAXQ Algorithm And Its Application
8	Optimal Control Of A Conveyor-Serviced Production Station With Dynamic Pickup Point Without Considering Delay Waiting
9	Reinforcement Learning Algorithms For Semi-markov Decision Processes
10	Continuous Time Hierarchical Reinforcement Learning Algorithm