The Traveling Salesman Problem (TSP) is a well-known combinatorial optimization problem (COP) with a profound impact on modern industry. In recent years, solvers built on Deep Reinforcement Learning (DRL) have become a new trend because they can solve instances in near real time, making DRL-based heuristics a practical option for TSP under certain conditions rather than merely a theoretical tool. COP heuristics of this kind are known as Neural Combinatorial Optimization (NCO) methods. However, existing NCO methods still fall short of traditional solvers in two respects: (1) the high training cost of the underlying Deep Learning (DL) models, and (2) solution quality that is generally worse than that of traditional solvers. These two shortcomings limit the practicality of NCO methods.

Building on existing NCO methods, this paper analyzes the characteristics of current DRL techniques and designs a new NCO method that addresses both issues:

1. To reduce the high training cost of the NCO model, this paper proposes a local attention mechanism based on sparse attention and applies it to a generative TSP solving model. By exploiting the unordered nature of TSP node data, the method avoids computing attention weights between queries and keys at every pair of positions, greatly reducing the model's memory requirements during training.

2. To improve the performance of the NCO model, this paper proposes a loss function based on equivalence constraints. The method improves the greedy rollout baseline in the existing loss function so as to induce the model to output equivalent solutions for equivalent TSP instances.

In comparative experiments against related methods on multiple test sets, the proposed method shows competitive solution quality, computational efficiency, and memory usage.
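The abstract does not specify how locality is defined in the sparse attention. One common choice for TSP models, given as a minimal sketch below, is to let each node attend only to its k nearest neighbors in the plane, which is possible precisely because the node set is unordered. The function name `knn_local_attention`, the standalone interface, and the default neighborhood size are illustrative assumptions, not the paper's actual design.

```python
# Minimal sketch (assumption): k-nearest-neighbor local attention over TSP nodes.
import torch
import torch.nn.functional as F

def knn_local_attention(q, k, v, coords, num_neighbors=16):
    """Each node attends only to its num_neighbors nearest cities.

    q, k, v:  (batch, n, d) projected queries, keys, values.
    coords:   (batch, n, 2) city coordinates of the TSP instance.
    """
    d = q.size(-1)
    n = q.size(1)
    # Pairwise Euclidean distances between cities: (batch, n, n).
    # This is the only O(n^2) tensor; no full attention matrix
    # (or its gradient) is ever materialized.
    dist = torch.cdist(coords, coords)
    # Indices of each node's nearest neighbors (self included): (batch, n, k).
    idx = dist.topk(num_neighbors, largest=False).indices
    # Gather the neighbors' keys and values: (batch, n, k, d).
    gather_idx = idx.unsqueeze(-1).expand(-1, -1, -1, d)
    k_nb = torch.gather(k.unsqueeze(1).expand(-1, n, -1, -1), 2, gather_idx)
    v_nb = torch.gather(v.unsqueeze(1).expand(-1, n, -1, -1), 2, gather_idx)
    # Scaled dot-product scores restricted to the neighborhood: (batch, n, k).
    scores = (q.unsqueeze(2) * k_nb).sum(-1) / d ** 0.5
    attn = F.softmax(scores, dim=-1)
    # Weighted sum of neighbor values: (batch, n, d).
    return (attn.unsqueeze(-1) * v_nb).sum(dim=2)
```

Since the coordinates do not change during encoding, the neighborhood indices can be computed once per instance and reused across all attention layers.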
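The equivalence-constraint loss is described only at a high level. One plausible reading, assuming that "equivalent instances" are coordinate transformations that preserve tour length (rotations and reflections of the unit square), is to share the greedy rollout baseline across all variants of the same instance, so that every variant is scored against a common reference. The `augment` helper, the `model(coords, decode=...)` interface, and the choice of the eight dihedral transforms are illustrative assumptions.

```python
# Minimal sketch (assumption): REINFORCE with a greedy baseline shared
# across length-preserving transforms of the same TSP instance.
import torch

def augment(coords):
    """All 8 rotations/reflections of (batch, n, 2) coordinates in [0, 1]^2."""
    x, y = coords[..., 0], coords[..., 1]
    variants = [(x, y), (1 - x, y), (x, 1 - y), (1 - x, 1 - y),
                (y, x), (1 - y, x), (y, 1 - x), (1 - y, 1 - x)]
    return torch.stack([torch.stack(v, dim=-1) for v in variants])  # (8, B, n, 2)

def equivalence_reinforce_loss(model, coords):
    """REINFORCE loss with a greedy baseline shared across equivalent instances."""
    aug = augment(coords)                        # (8, B, n, 2)
    flat = aug.flatten(0, 1)                     # (8*B, n, 2)
    # Sampled rollout: tour cost and summed log-probability per instance.
    cost, log_p = model(flat, decode='sample')   # hypothetical interface
    with torch.no_grad():                        # baseline carries no gradient
        greedy_cost, _ = model(flat, decode='greedy')
    # Average the greedy cost over each equivalence class: all 8 variants
    # of an instance are judged against the same reference value, which
    # penalizes any variant whose solution is worse than its siblings'.
    baseline = greedy_cost.view(8, -1).mean(dim=0).repeat(8)  # (8*B,)
    return ((cost - baseline).detach() * log_p).mean()
```

In practice the greedy rollout is often produced by a periodically updated copy of the policy, as in the greedy rollout baseline the abstract builds on; the sketch uses the policy itself only for brevity.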