
Research On Stochastic Economic Lot Scheduling Problem Algorithm Based On Deep Reinforcement Learning

Posted on: 2023-03-12
Degree: Master
Type: Thesis
Country: China
Candidate: N Mi
Full Text: PDF
GTID: 2568306614487214
Subject: Control Science and Engineering

Abstract/Summary:
Intelligent manufacturing is internationally recognized as the breakthrough point for the next generation of industrial innovation. Realizing intelligent manufacturing requires intelligent production processes, which poses new challenges for the artificial-intelligence algorithms used in production control, in particular the demands for strong generalization ability and high flexibility of optimization algorithms in dynamic production scheduling.

In industrial production, the stochastic economic lot scheduling problem (SELSP) is a common and complex dynamic optimization problem. The core issue an optimization algorithm must resolve is the trade-off between producing in advance and holding inventory. Beyond this, the algorithm must also cope with deviations between predicted and actual demand, and with dynamic changes in the number and/or type of products. How well the algorithm handles the trade-off reflects its optimization performance, while how well it handles the latter two issues reflects its generalization ability and flexibility. Aiming at improving optimization performance, generalization ability, and flexibility, this thesis studies the SELSP based on deep reinforcement learning (DRL) and proposes a DRL-based algorithm that learns a dynamic scheduling policy. The specific work is as follows.

First, a mathematical model of the SELSP is established, taking a biopharmaceutical production process as an example. To solve the problem with reinforcement learning, the SELSP is modeled as a Markov decision process, and a matrix-based environment-state representation is proposed according to the characteristics of the problem. A simulation environment is then built from the mathematical model and the reinforcement learning model.

Second, to process the parallel environment-state information, a feature extraction network based on the self-attention mechanism is designed to extract, from the raw state information, features that are helpful for production decision-making, improving the DRL algorithm's ability to handle both the production/inventory trade-off and demand deviations. Because the feature extraction network accepts matrix-valued state inputs, the algorithm can flexibly handle different numbers of products and thus adapt to dynamic changes in product quantity and/or type.

Then, actor-critic is chosen as the DRL framework to improve decision-making performance, and proximal policy optimization (PPO) is used to train the feature extraction and production decision networks. Numerical experiments show that the scheduling optimization performance of the algorithm is satisfactory. In addition, the algorithm handles problems with deviations between actual and predicted demand, as well as problems with a changing number of products, demonstrating strong generalization ability and flexibility.

Finally, based on this algorithm, intelligent production scheduling software is designed and developed to help enterprises formulate production scheduling plans flexibly, reducing inventory costs and improving economic performance and demonstrating the practical value of the reinforcement learning algorithm proposed in this thesis.
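The abstract does not give the network architecture, but the combination it describes — a matrix state with one row per product, fed through self-attention so the same weights handle any number of products — can be sketched as follows. All names, feature choices, and dimensions here are illustrative assumptions, not details from the thesis; a minimal single-head attention layer in NumPy is used in place of the actual trained network.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

class SelfAttentionExtractor:
    """Single-head self-attention over per-product feature rows.

    The environment state is assumed to be a matrix with one row per
    product (hypothetical features: inventory level, predicted demand,
    setup status, ...). Because the weights act on feature columns only,
    the same extractor handles any number of product rows.
    """

    def __init__(self, n_features, d_model, seed=0):
        rng = np.random.default_rng(seed)
        scale = 1.0 / np.sqrt(n_features)
        self.Wq = rng.normal(0.0, scale, (n_features, d_model))
        self.Wk = rng.normal(0.0, scale, (n_features, d_model))
        self.Wv = rng.normal(0.0, scale, (n_features, d_model))

    def __call__(self, state):
        # state: (n_products, n_features) matrix.
        Q, K, V = state @ self.Wq, state @ self.Wk, state @ self.Wv
        attn = softmax(Q @ K.T / np.sqrt(Q.shape[-1]), axis=-1)
        return attn @ V  # (n_products, d_model) per-product features

# Demo: the same weights process 3 products and 5 products alike.
rng = np.random.default_rng(42)
extractor = SelfAttentionExtractor(n_features=4, d_model=8)
feats_3 = extractor(rng.normal(size=(3, 4)))  # shape (3, 8)
feats_5 = extractor(rng.normal(size=(5, 4)))  # shape (5, 8)
```

The per-product feature rows produced here would then feed the actor and critic heads; that downstream decision network is not shown.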
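The abstract names proximal policy optimization as the training algorithm. Its core clipped surrogate objective (here negated, so that minimizing it maximizes the surrogate) can be sketched as below; the clip coefficient and array shapes are illustrative, and the advantage estimates are assumed to be precomputed.

```python
import numpy as np

def ppo_clip_loss(ratio, advantage, eps=0.2):
    """PPO clipped surrogate loss.

    ratio     -- pi_new(a|s) / pi_old(a|s) for each sampled action
    advantage -- advantage estimate for each sampled action
    eps       -- clip range; 0.2 is a common default, assumed here
    """
    unclipped = ratio * advantage
    clipped = np.clip(ratio, 1.0 - eps, 1.0 + eps) * advantage
    # Taking the elementwise minimum removes the incentive to move the
    # policy ratio outside [1 - eps, 1 + eps].
    return -np.minimum(unclipped, clipped).mean()
```

For a ratio inside the clip range the loss reduces to the plain policy-gradient surrogate; outside it, the gradient with respect to the ratio vanishes, which is what keeps PPO updates conservative.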
Keywords/Search Tags: deep reinforcement learning, self-attention mechanism, dynamic optimization, stochastic economic lot scheduling problem, intelligent production scheduling software