
Maintenance Policy For Deteriorating System Based On Reinforcement Learning

Posted on: 2012-10-07 | Degree: Master | Type: Thesis
Country: China | Candidate: Y M Guo | Full Text: PDF
GTID: 2132330335461603 | Subject: Computer application technology
Abstract/Summary:
In industrial production, operating time and environmental conditions cause a system's state to deteriorate continuously, so its efficiency and performance gradually decline. Once the system can no longer meet its work requirements, it is regarded as failed even if it can still run, and such failures cause large economic losses. Preventive maintenance applies one or more maintenance actions to detect or eliminate potential failures, keeping the system in good working condition and preventing it from failing. This can substantially reduce production cost and support industrial production. How to schedule maintenance so that the system avoids running in high-cost states, while improving its reliability and safety, is therefore an important research topic.

Based on reinforcement learning, this thesis first models the maintenance policy problem for a deteriorating system with discrete states and continuous time as a semi-Markov decision process (SMDP). To escape locally optimal results, it proposes an algorithm that combines Q-learning with simulated annealing to find the optimal maintenance policy. Optimized results under both the average-cost and discounted-cost criteria are obtained by simulation, and the simulation data are used to discuss how the inspection interval influences the optimized average cost.

The thesis also considers a partially observed deteriorating system whose inspections are subject to error, and models it as a partially observed semi-Markov decision process (POSMDP). The maintenance policy is solved from two perspectives: a memoryless approach using the Sarsa(λ) algorithm, and a memory-based approach using the NSM algorithm. Optimized results under both the average-cost and discounted-cost criteria are again obtained by simulation. The relationship between the inspection interval and the optimized average cost is the same as in the completely observed problem. Finally, the thesis discusses the influence of different values of k in NSM, and the results are consistent with what is observed in practice.
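
The idea of combining Q-learning with simulated annealing can be illustrated with a small sketch. It is not the thesis's algorithm or model: the five-state system, the cost values, the deterioration probability, the cooling schedule, and the names `step`, `sa_select`, and `learn` are all assumptions chosen for illustration. The sketch uses a fixed inspection interval, so the SMDP discount factor reduces to exp(-BETA * TAU), and it covers only the discounted-cost criterion.

```python
import math
import random

# All numbers below are hypothetical; the abstract gives no model parameters.
N_STATES = 5          # 0 = as good as new, ..., 4 = failed
CONTINUE, MAINTAIN = 0, 1
TAU = 1.0             # inspection interval (sojourn time between decisions)
BETA = 0.1            # continuous-time discount rate
P_DETERIORATE = 0.3   # chance of moving one state worse per interval
C_MAINTAIN = 5.0      # preventive maintenance cost
C_FAILURE = 50.0      # corrective replacement cost after failure


def step(state, action, rng):
    """Return (cost, next_state) for one inspection interval."""
    if state == N_STATES - 1:            # failed: replacement is forced
        return C_FAILURE, 0
    if action == MAINTAIN:               # preventive maintenance renews the system
        return C_MAINTAIN, 0
    cost = state * TAU                   # operating cost grows with deterioration
    if rng.random() < P_DETERIORATE:
        state += 1
    return cost, state


def sa_select(q, state, temperature, rng):
    """Metropolis-style action selection for a cost-minimising agent."""
    greedy = min((CONTINUE, MAINTAIN), key=lambda a: q[state][a])
    candidate = rng.choice((CONTINUE, MAINTAIN))
    gap = q[state][candidate] - q[state][greedy]   # always >= 0
    # Accept the (possibly worse) candidate with probability exp(-gap / T).
    if rng.random() < math.exp(-gap / temperature):
        return candidate
    return greedy


def learn(n_steps=20000, alpha=0.1, t0=10.0, cooling=0.9995, seed=0):
    """Q-learning with annealed exploration; returns (Q-table, greedy policy)."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    gamma = math.exp(-BETA * TAU)        # discount over one sojourn time
    state, temperature = 0, t0
    for _ in range(n_steps):
        action = sa_select(q, state, temperature, rng)
        cost, nxt = step(state, action, rng)
        target = cost + gamma * min(q[nxt])          # costs are minimised
        q[state][action] += alpha * (target - q[state][action])
        state = nxt
        temperature = max(temperature * cooling, 1e-3)   # annealing schedule
    policy = [min((CONTINUE, MAINTAIN), key=lambda a: q[s][a])
              for s in range(N_STATES)]
    return q, policy


if __name__ == "__main__":
    q_table, policy = learn()
    print("greedy action per state (0 = continue, 1 = maintain):", policy)
```

At high temperature the agent often accepts a worse action, which is what lets it leave a locally optimal policy. As the temperature cools, selection becomes nearly greedy. In this toy model both actions behave identically in the failed state, so the policy entry for that state carries no meaning.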
Keywords/Search Tags: Deteriorating System, Maintenance Policy, Reinforcement Learning, SMDP, POSMDP