The flexible job shop is a production mode widely found in discrete manufacturing systems, and the flexible job shop scheduling problem is of great value to both research and industry. Because of the complexity and diversity of the problem, related research has gradually shifted toward approximate methods. With both solution quality and solving efficiency in mind, improving generalization performance has become a common focus and long-term objective of industry and academia. In recent years, approximate methods based on machine learning have shown potential, since they can automatically learn to obtain better solutions. However, they generally suffer from limitations in state representation or depend on expert knowledge, and existing research on other problems is difficult to apply directly to the flexible job shop scheduling problem.

This thesis studies scheduling optimization for the flexible job shop scheduling problem based on deep reinforcement learning and proposes a Graph Neural Network Deep Reinforcement Learning algorithm, covering model construction, state representation, the networks for feature extraction and decision-making, and the training procedure. The specific work is as follows. First, this thesis analyzes the characteristics and assumed constraints of the flexible job shop and establishes a mathematical programming model of the scheduling problem. Based on the principles of reinforcement learning, a Markov decision process model of the flexible job shop scheduling problem is then established, which provides the mathematical basis for reinforcement learning. Second, in view of the characteristics of the job shop and the limitations of existing state representations for the Markov decision process, a heterogeneous graph model is established by extending the traditional disjunctive graph; it describes the production state of the job shop more comprehensively while reducing the graph density. Third, a two-stage heterogeneous graph neural network is designed to embed the non-Euclidean graph information into low-dimensional feature vectors, realizing effective extraction of state features. Finally, the policy network is designed based on a multi-layer perceptron: the embedded feature vectors describing the state of the operations, the machines, and the job shop are concatenated into a single vector, so that many operation-machine combinations can be fed into the policy network in parallel. On this basis, a size-agnostic algorithm for solving the scheduling problem is designed.

Generated instances and classical benchmark instances are used to verify the effectiveness of the proposed algorithm and to analyze its solution quality, efficiency, and generalization performance. The test results show that, when trained and tested on small-scale and medium-scale scheduling problems, the proposed algorithm is slightly slower than the dispatching rules but still solves each instance within seconds, and its solution quality is better than that of the dispatching rules on average. When the model trained on small-scale and medium-scale scheduling problems is applied directly to the classical instances and to large-scale problems, the algorithm also achieves better solution quality than the dispatching rules with solving times on the order of seconds, showing good generalization performance. Compared with the reported results of related deep reinforcement learning-based methods on the classical instances, the algorithm obtains better solution quality on average, again with solving times on the order of seconds. Compared with the OR-Tools solver, the algorithm is more efficient, and as the problem scale increases, its solution quality gradually approaches and eventually surpasses that of the solver when the solver is limited to a reasonable solving time.

According to the functional and performance requirements, flexible job shop scheduling software is designed and developed for operators and researchers. The software provides functions such as job shop instance scheduling, model performance testing, processing simulation, and model training. The running results show the application potential of the proposed algorithm.
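
The size-agnostic decision step described above (concatenating operation, machine, and job shop embeddings and scoring every candidate combination with a multi-layer perceptron) can be illustrated with a minimal sketch. It is not the thesis implementation: the embedding size `D`, the layer sizes, the random weights, the random embeddings standing in for the graph neural network output, and all function names are assumptions made only for illustration, and the pairs are scored in a plain loop where a real implementation would batch them.

```python
import math
import random


def mlp(x, layers):
    """Apply a small MLP: tanh on hidden layers, linear output layer."""
    for i, (weights, biases) in enumerate(layers):
        x = [sum(w * v for w, v in zip(row, x)) + b
             for row, b in zip(weights, biases)]
        if i < len(layers) - 1:
            x = [math.tanh(v) for v in x]
    return x


def init_layers(sizes, rng):
    """Random weights for illustration only (a trained policy would be loaded)."""
    layers = []
    for n_in, n_out in zip(sizes, sizes[1:]):
        weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_out)]
        layers.append((weights, [0.0] * n_out))
    return layers


def action_probabilities(op_emb, mach_emb, global_emb, feasible_pairs, layers):
    """Score each feasible (operation, machine) pair, then softmax over pairs."""
    scores = []
    for o, m in feasible_pairs:
        # Concatenate operation, machine, and job shop (global) embeddings.
        x = op_emb[o] + mach_emb[m] + global_emb
        scores.append(mlp(x, layers)[0])
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


D = 4  # hypothetical embedding size
rng = random.Random(0)
# The MLP input size depends only on D, never on the instance size.
layers = init_layers([3 * D, 8, 1], rng)


def random_instance(n_ops, n_machines):
    """Stand-in for GNN output: random embeddings, every pair assumed feasible."""
    op_emb = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(n_ops)]
    mach_emb = [[rng.uniform(-1, 1) for _ in range(D)] for _ in range(n_machines)]
    global_emb = [rng.uniform(-1, 1) for _ in range(D)]
    pairs = [(o, m) for o in range(n_ops) for m in range(n_machines)]
    return op_emb, mach_emb, global_emb, pairs


# The same weights handle a 3x2 instance and a 10x5 instance.
small = action_probabilities(*random_instance(3, 2), layers)
large = action_probabilities(*random_instance(10, 5), layers)
print(len(small), len(large))
```

Because the network only ever sees one fixed-length concatenated vector per candidate pair, the number of operations and machines affects how many pairs are scored, not the shape of the network, which is what allows a model trained on small instances to be applied to larger ones.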