An End-to-end Hierarchical Reinforcement Learning Framework For Large-scale Dynamic Flexible Job-shop Scheduling

Posted on: 2023-12-16
Degree: Master
Type: Thesis
Country: China
Candidate: K Lei
GTID: 2542307073981699
Subject: Mechanical engineering
Abstract/Summary:
Facing today's fierce market competition and high energy consumption, the manufacturing industry bears enormous economic pressure and the challenge of environmental sustainability. To meet customers' personalized requirements while saving costs, enterprises must constantly improve the scalability, flexibility, and reliability of their manufacturing systems. However, manufacturing systems characterized by individualization, small-batch orders, and multiple product varieties usually face order changes, process and personnel changes, and machine damage during actual processing. As a result, many tasks cannot be processed according to pre-made schedules. When sudden dynamic events occur, such as order insertion or cancellation or a machine breakdown, the existing schedule must be adjusted dynamically to meet completion time, delivery time, and other optimization objectives. This kind of dynamic scheduling problem is highly uncertain and complex, and it demands real-time decision-making ability and robustness from the manufacturing system. It is nevertheless the setting that best matches modern manufacturing systems, so developing an on-line scheduling method for such problems is of considerable importance.

This thesis studies the flexible job-shop scheduling problem with dynamic job arrival. Building on the strengths of deep reinforcement learning (DRL) in perceiving state features and in policy-based decision making, it proposes an end-to-end hierarchical DRL framework for global optimization of the dynamic scheduling problem. The framework can solve the dynamic flexible job-shop scheduling problem (DFJSP) within dozens of seconds, so it can be applied directly to real-time optimization of a production line. Because of the problem's complexity, it is divided into several sub-problems that are optimized separately. The main contributions of this thesis are as follows:

1) First, the job routing optimization problem is studied. An end-to-end deep reinforcement learning framework is proposed for solving a class of routing problems. A novel residual-edge-graph attention network is developed to embed the state features of the graph representations of the routing problems, and a decoder based on the Transformer model is designed to predict solutions efficiently. Two actor-critic-style DRL algorithms are designed to train the proposed encoder-decoder model. The performance of the proposed framework is tested on the traveling salesman problem and the vehicle routing problem.

2) Next, a DRL framework for static flexible job-shop scheduling is studied. The static problem is decomposed into two sub-tasks and modeled as multi-Markov Decision Processes (MMDPs). To solve the MMDPs, a graph-neural-network-based encoder is designed to encode the disjunctive-graph representation of the FJSP efficiently, and a multilayer perceptron (MLP) decoder is designed. To train the proposed model to learn an efficient scheduling policy, a multi-Proximal Policy Optimization (multi-PPO) algorithm is proposed. Results on randomly generated and benchmark instances demonstrate that the proposed framework outperforms heuristic algorithms in both running time and solution quality.

3) Finally, to optimize the large-scale dynamic flexible job-shop scheduling problem, a novel end-to-end hierarchical reinforcement learning framework is proposed based on the above research. The framework includes one higher-level agent and two lower-level agents. A DDQN is designed for the higher-level agent, which learns to optimize the problem globally over a long horizon. A policy network based on an encoder-decoder architecture is designed for the two lower-level agents, which cooperatively optimize the static FJSPs into which the higher-level agent divides the dynamic problem. To evaluate the effectiveness and feasibility of the proposed framework, a simulator of the dynamic environment is designed to produce instances for training and testing. The testing results verify the superiority of the proposed framework in solving this kind of dynamic problem. In addition, the feasibility of applying the proposed framework to other scheduling problems with different dynamic events is analyzed.

Overall, this thesis develops a real-time scheduling system for the large-scale dynamic environments encountered in practical production. The research combines the state-feature extraction strength of deep learning with the action-decision ability of reinforcement learning. The method optimizes the DFJSP hierarchically and provides a new way to solve this kind of dynamic scheduling problem. The proposed framework also helps modern manufacturing enterprises move toward intelligent scheduling and provides a theoretical basis for intelligent manufacturing systems.
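
Contribution 3) assigns a DDQN to the higher-level agent. As a minimal sketch of the update target that Double DQN is generally understood to use (the online network selects the next action, the target network evaluates it), the following may help. The function name, argument names, and plain-list Q-values are illustrative assumptions and are not taken from the thesis, whose state, action, and reward definitions are not given in this abstract.

```python
from typing import Sequence


def double_dqn_target(
    reward: float,
    done: bool,
    gamma: float,
    q_online_next: Sequence[float],
    q_target_next: Sequence[float],
) -> float:
    """Return the Double DQN bootstrap target for one transition.

    All names here are hypothetical. q_online_next and q_target_next hold
    the Q-values of every action in the next state, as estimated by the
    online network and the target network respectively.
    """
    if done:
        # No bootstrapping from a terminal state.
        return reward
    # Action selection uses the online network ...
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    # ... while action evaluation uses the target network.
    return reward + gamma * q_target_next[best_action]


if __name__ == "__main__":
    # Toy numbers only; they do not come from the thesis.
    y = double_dqn_target(1.0, False, 0.9, [0.2, 0.8, 0.5], [1.0, 2.0, 3.0])
    print(y)
```

In this toy call the online network prefers action 1, so the target bootstraps from the target network's value for action 1 rather than from the target network's own maximum. Decoupling selection from evaluation in this way is the usual motivation for Double DQN over plain DQN.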
Keywords/Search Tags: hierarchical reinforcement learning, large-scale dynamic scheduling, flexible job-shop scheduling, graph neural network, PPO algorithm, DDQN algorithm