| On the modern battlefield, changes in radar systems, the growing number of radar working modes, and improved radar anti-jamming capability have made the electromagnetic environment more complicated, and it has become harder for a jammer to determine the radar working mode in real time from intercepted radar signals. For any given radar working mode, the jammer can choose among a variety of jamming patterns. Traditional jamming-method selection relies on experience or template matching, which cannot guarantee that the selected jamming method is optimal. To improve jamming decision-making performance in a complex electromagnetic environment, this thesis proposes a jamming decision-making method based on supervised-sampling reinforcement learning. The thesis mainly studies jamming decision-making technology based on deep reinforcement learning: taking the radar working mode as the basis, it studies jamming decision-making based on the Deep Q-Network (DQN) and the Double Deep Q-Network (DDQN), together with improved versions of each. Simulation results show that the proposed method achieves better decision performance. The main research work of this thesis is as follows. The radar jamming decision-making process is analyzed, and a jamming decision-making model based on deep reinforcement learning is proposed. The commonly used radar working modes in this model and the jamming methods available to the jammer are analyzed separately. The jamming benefit obtained by the jammer after jamming is studied, and the method for calculating this benefit is discussed in depth. Commonly used radar signal characteristic parameters are analyzed, the signal characteristics that differ most between radar working modes are selected, and the selected characteristics are modeled. Several commonly used radar working mode recognition methods and the process of recognizing radar working modes are studied. The process of recognizing the radar working mode with a Back Propagation (BP) neural network is studied, and a radar working mode recognition model based on the BP neural network is constructed. Simulation experiments are then carried out to compare the recognition performance of the different methods. The results show that the recognition method based on the BP neural network is less affected by parameter measurement error and achieves high accuracy. The Markov decision process that underlies decision-making with the DQN algorithm is analyzed, together with the basic principles and model of the algorithm. The principle of jamming decision-making based on DQN is discussed, and its steps are given. To address the imbalance of training samples caused by random sampling in the DQN algorithm, a supervised sampling method is proposed, and a DQN jamming decision-making method based on supervised sampling is studied. Simulation experiments are then carried out to verify the decision performance of DQN and its improved method. The DDQN decision algorithm model and the principle of jamming decision-making based on DDQN are analyzed, and the steps of jamming decision-making based on DDQN are given. To address the imbalance of training samples caused by random sampling in the DDQN algorithm, the supervised sampling method proposed in this thesis is applied to the original DDQN method, and a DDQN jamming decision-making method based on supervised sampling is studied. Finally, simulation experiments are conducted to verify the decision-making effectiveness of DDQN and its improved method, and to compare the decision-making performance of DQN and its improved method with that of DDQN and its improved method. |
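
The abstract compares DQN and DDQN without stating how the two differ, so a minimal sketch of the standard textbook learning targets may help. It is not taken from the thesis. The function names, the three jamming actions, and all numeric values are hypothetical, and the supervised sampling method is not shown because the abstract does not describe it.

```python
from typing import Sequence


def dqn_target(reward: float, gamma: float,
               q_target_next: Sequence[float], done: bool) -> float:
    """Standard DQN target: the target network both selects and
    evaluates the best next action (a single max)."""
    if done:
        return reward
    return reward + gamma * max(q_target_next)


def ddqn_target(reward: float, gamma: float,
                q_online_next: Sequence[float],
                q_target_next: Sequence[float], done: bool) -> float:
    """Standard Double DQN target: the online network selects the
    next action, and the target network evaluates that action."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)),
                      key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Hypothetical Q-values for three jamming actions in the next state
# (assumed numbers, for illustration only).
q_online_next = [0.2, 0.5, 0.1]   # online network estimates
q_target_next = [0.4, 0.3, 0.8]   # target network estimates

y_dqn = dqn_target(1.0, 0.9, q_target_next, done=False)
y_ddqn = ddqn_target(1.0, 0.9, q_online_next, q_target_next, done=False)
print(round(y_dqn, 2), round(y_ddqn, 2))
```

With these assumed values the DQN target is about 1.72 and the DDQN target about 1.27. The gap arises because DDQN separates action selection from action evaluation, which is the usual remedy for the overestimation a single max tends to introduce.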