The confrontation between wireless communication and jamming is one of the central tasks of electronic warfare, and all countries attach great importance to developing the countermeasure capabilities of cognitive electronic warfare systems. However, traditional methods have failed to achieve truly unmanned, intelligent confrontation, and new artificial intelligence methods are urgently needed to achieve a breakthrough. In this thesis, reinforcement learning, multi-agent game theory, and opponent modeling are introduced into cognitive electronic warfare so that both the offensive and defensive parties gain reasoning capabilities and adaptive attack or anti-attack capabilities.

To solve the problem of unreasonable electromagnetic spectrum regulation caused by the difficulty of evaluating the jamming effect in electronic countermeasure systems, an intelligent algorithm based on deep reinforcement learning with cooperative training and non-cooperative countermeasures is proposed. First, a jamming and anti-jamming game model is proposed and its Nash equilibrium is analyzed, and the actions, utility functions, and cost constraints of the agents are designed. Second, in the cooperative training stage, the opponent’s interaction data is used to continuously update the agent’s deep Q-network and optimize its output strategy. Finally, for the non-cooperative countermeasure stage, in which the opponent’s actions are unknown, an information feedback method for the anti-jammer and an eavesdropping-clustering-action level mapping method for the jammer are proposed to estimate the opponent’s actions. Simulation results show that, compared with the non-intelligent party, the anti-jamming ability of an agent based on the proposed algorithm increases by 29.9% and its jamming ability increases by 24.3%. Compared with an agent without cooperative training, the anti-jamming ability increases by 23.0% and the jamming ability increases by 15.8%, which demonstrates the effectiveness of the proposed algorithm.

To solve the problem of unsatisfactory decision-making caused by the agent’s lack of advance planning, an algorithm called opponent policy reasoning deep reinforcement learning for intelligent countermeasure is proposed. First, variational inference is used to perform probabilistic recursive reasoning over the opponent’s strategy model and thereby estimate the opponent’s potential actions. Then, opponent modeling is integrated into the agent’s reinforcement learning training: the agent’s current state and the inferred information about the opponent’s strategy are used together to optimize the agent’s decisions and form an adaptive optimal response. Simulation results show that, after the opponent policy reasoning module is added, the agent’s anti-jamming ability increases by 27.4% and its jamming ability increases by 21.1%. This shows that the proposed algorithm further improves the offensive and defensive capabilities of both parties in the game and enhances the learning ability of the agent.
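
The eavesdropping-clustering-action level mapping mentioned above can be illustrated with a minimal sketch. The abstract does not specify the clustering method, so the example below assumes a plain one-dimensional k-means over eavesdropped power samples; the function name `cluster_to_action_levels`, its parameters, and the sample values are all hypothetical and are not taken from the thesis. Each sample is assigned to a cluster, and clusters are ranked by centroid so that the rank serves as an estimated discrete action level of the opponent.

```python
def cluster_to_action_levels(samples, num_levels, iterations=20):
    """Assumed 1-D k-means: map eavesdropped power samples to action levels.

    Returns (centroids, levels), where centroids are sorted in ascending
    order and levels[i] is the index of the centroid nearest to samples[i].
    """
    ordered = sorted(samples)
    n = len(ordered)
    # Initialise centroids at evenly spaced quantiles of the sorted samples.
    centroids = [ordered[(2 * k + 1) * n // (2 * num_levels)]
                 for k in range(num_levels)]

    for _ in range(iterations):
        groups = [[] for _ in range(num_levels)]
        for x in samples:
            nearest = min(range(num_levels),
                          key=lambda k: abs(x - centroids[k]))
            groups[nearest].append(x)
        # Recompute each centroid as its cluster mean; keep the old value
        # if a cluster received no samples.
        centroids = [sum(g) / len(g) if g else c
                     for g, c in zip(groups, centroids)]

    centroids.sort()
    # The rank of the nearest centroid is the estimated action level.
    levels = [min(range(num_levels), key=lambda k: abs(x - centroids[k]))
              for x in samples]
    return centroids, levels


# Hypothetical eavesdropped power samples drawn from three power settings.
observed = [0.9, 1.1, 1.0, 5.0, 5.2, 4.8, 9.9, 10.1, 10.0]
centroids, levels = cluster_to_action_levels(observed, num_levels=3)
```

In a pipeline like the one the abstract describes, the estimated levels would stand in for the unknown opponent action when forming the agent's training data; how the thesis actually does this is not stated here.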