| The flexible job shop scheduling problem (FJSP) is a popular research topic in the field of production scheduling. It is NP-hard, and it is one of the most critical problems in industrial manufacturing and process planning. The problem requires assigning a group of workpieces to a group of machines for processing. Each workpiece consists of multiple operations with precedence constraints between them, and each operation can be assigned to any one of several machines. The traditional FJSP ignores sequence-dependent setup times and resource constraints, yet these constraints should be taken into account in real manufacturing. This paper therefore studies the FJSP with sequence-dependent setup times and resource constraints. It applies the Harris hawks optimization (HHO) algorithm to this problem and proposes a self-learning Harris hawks optimization (SLHHO) algorithm that combines HHO with reinforcement learning. The contributions of this paper are as follows: (1) A discrete version of the HHO algorithm is implemented to solve the FJSP with sequence-dependent setup times and resource constraints. Since it was proposed in 2019, the HHO algorithm has mainly been used to solve numerical optimization problems, on which it has performed very well, whereas the FJSP is a combinatorial optimization problem. This paper is the first to use the HHO algorithm for the FJSP with setup times and resource constraints. The search space of the HHO algorithm is continuous, but the machine assignment and operation sequence in the FJSP are discrete. Therefore, the real-valued position vector produced by the algorithm must be converted into a machine assignment vector and an operation sequence vector, both represented by discrete integers, in order to solve the FJSP. (2) A decoding method is designed to reduce the impact of the resource constraints as far as possible. The resource constraints in this paper mainly refer to the limited number of operators who can change tools on the machines at the same time. Therefore, when the machine assignment vector and operation sequence vector are decoded into a schedule, situations in which a machine must wait for an operator to become free should be avoided as far as possible. (3) A reinforcement learning algorithm is used to adjust the key parameters of the algorithm adaptively and thereby improve its performance. Based on the characteristics of the problem model in this paper, the state set and action set of the reinforcement learning algorithm are defined, and a suitable reward scheme and action selection strategy are designed. |
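The conversion described in contribution (1), from a real-valued position vector to a machine assignment vector and an operation sequence vector, can be illustrated with a minimal sketch. The abstract does not state the exact mapping, so everything here is an assumption: the function name `decode_position`, the `candidates` layout, the search bound `bound`, the linear scaling used for machine selection, and the ranked-order-value rule used for sequencing are common choices in the discrete metaheuristic literature, not necessarily the paper's method.

```python
from typing import List, Sequence, Tuple


def decode_position(
    position: Sequence[float],
    candidates: List[List[List[int]]],
    bound: float = 1.0,
) -> Tuple[List[int], List[int]]:
    """Convert a real-valued position into (machine assignment, operation sequence).

    candidates[j][k] lists the machines that can process operation k of job j
    (hypothetical layout). The position holds 2 * n values, where n is the
    total number of operations: the first n drive machine selection, the last
    n drive sequencing.
    """
    # Flatten operations job by job; flat_jobs[i] is the job owning operation i.
    flat_jobs = [j for j, ops in enumerate(candidates) for _ in ops]
    flat_cands = [cands for ops in candidates for cands in ops]
    n = len(flat_jobs)
    if len(position) != 2 * n:
        raise ValueError("position must have 2 * (number of operations) entries")

    # Machine part (assumed rule): clamp each value to [-bound, bound], then
    # scale it linearly to an index into that operation's candidate list.
    machines = []
    for x, cands in zip(position[:n], flat_cands):
        x = max(-bound, min(bound, x))
        idx = round((x + bound) / (2 * bound) * (len(cands) - 1))
        machines.append(cands[idx])

    # Sequence part (assumed rule): ranked order value. Sort the slots by
    # value, then replace each slot with its owning job. The k-th occurrence
    # of job j in the result stands for the k-th operation of job j, so the
    # precedence constraints within a job hold by construction.
    order = sorted(range(n), key=lambda i: position[n + i])
    sequence = [flat_jobs[i] for i in order]
    return machines, sequence
```

Because the sequence is expressed as repeated job indices rather than explicit operation identifiers, any real-valued vector decodes to a precedence-feasible order, which is one reason this encoding is often paired with continuous optimizers. Handling of setup times and operator availability would belong to a later schedule-building step and is not shown here.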