
Research On Spectrum Allocation Technology Of UAV Flight Formation Based On Reinforcement Learning

Posted on: 2021-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhou
Full Text: PDF
GTID: 2492306047979669
Subject: Master of Engineering
Abstract/Summary:
With the development of unmanned aerial vehicle (UAV) technology and the spread of its applications, a single UAV operating in a complex and changing environment increasingly struggles to complete tasks on its own, so research on networking multi-UAV flight formations has attracted growing attention. Moreover, future unmanned combat platforms on the information battlefield will carry more frequency-dependent equipment and occupy wider frequency bands; spectrum demand will grow rapidly, and information links will face interference. It is therefore particularly important for an unmanned combat platform to use spectrum autonomously in a contested environment, and for a UAV formation to coordinate its spectrum use on the spot. This thesis focuses on spectrum allocation for UAV cluster networks, studying dynamic channel allocation and dynamic time-slot allocation algorithms based on reinforcement learning.

First, the networking modes of UAV cluster networks and the relevant reinforcement learning algorithms are studied. The networking modes considered are fair competition and priority-based networking; the reinforcement learning algorithms considered are Q-learning and deep reinforcement learning, which are compared and analyzed.

Second, the UAV cluster network under fair-competition networking is studied. Based on the characteristics of dynamic channel allocation and dynamic time-slot allocation, the problem is mapped onto a reinforcement learning environment so that it matches the interface of the learning algorithm. Two algorithms are examined: the standard deep Q-network (DQN) and an improved algorithm (DQN+RC) that replaces the convolutional neural network with a reservoir. Four evaluation metrics are used to compare performance: average collision probability, average reward, channel utilization, and slot utilization. The experimental results show that the improved algorithm performs best, converging about 500 steps faster than DQN; after convergence, its channel utilization and slot utilization are 3% higher than those of DQN.

Finally, spectrum interaction in a network with priorities is studied. The main difference from the fair-competition network is that UAVs are classified and spectrum is allocated by priority, which prevents any single UAV from occupying too many spectrum resources. An M/G/1 queuing model is introduced to evaluate the delay improvement of the priority mechanism. An improved algorithm (DQN+LSTM), which combines reinforcement learning with a long short-term memory network, is proposed to accelerate convergence. The experimental results show that DQN+LSTM converges about 2000 steps faster than DQN+RC, with slot utilization 8% higher after convergence, and the priority network reduces UAV delay by 83%.
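The dynamic channel allocation task described above can be sketched, in greatly simplified form, as a stateless Q-learning problem in which each UAV independently learns a channel that avoids collisions with the others. The thesis itself uses DQN and DQN+RC over a richer environment; the tabular sketch below is only an illustrative assumption, with the UAV count, channel count, reward values, and hyperparameters invented for the example:

```python
import random

def train_channel_selection(n_uavs=2, n_channels=3, episodes=2000,
                            alpha=0.1, epsilon=0.1, seed=0):
    """Stateless (bandit-style) Q-learning: each UAV keeps one Q-value
    per channel and gets reward +1 for a collision-free transmission,
    -1 when another UAV picked the same channel."""
    rng = random.Random(seed)
    q = [[0.0] * n_channels for _ in range(n_uavs)]  # one Q-table per UAV
    for _ in range(episodes):
        # epsilon-greedy channel choice for every UAV
        picks = []
        for i in range(n_uavs):
            if rng.random() < epsilon:
                picks.append(rng.randrange(n_channels))
            else:
                picks.append(max(range(n_channels), key=lambda c: q[i][c]))
        # reward and incremental Q-update
        for i, c in enumerate(picks):
            r = -1.0 if picks.count(c) > 1 else 1.0
            q[i][c] += alpha * (r - q[i][c])
    # greedy policy after training: each UAV's preferred channel
    return [max(range(n_channels), key=lambda c: q[i][c])
            for i in range(n_uavs)]
```

With enough exploration the UAVs typically settle on distinct channels, mirroring the collision-probability metric used in the thesis; the deep variants replace the per-UAV table with a neural (or reservoir) function approximator over an observed spectrum state.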
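The M/G/1 delay evaluation mentioned for the priority network rests on the Pollaczek-Khinchine formula for mean queueing delay, W = lambda * E[S^2] / (2 * (1 - rho)) with utilization rho = lambda * E[S]. A minimal sketch of that computation (the numeric parameters here are illustrative, not values from the thesis):

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean waiting time in queue for an M/G/1 system via the
    Pollaczek-Khinchine formula: W = lambda * E[S^2] / (2 * (1 - rho))."""
    rho = arrival_rate * mean_service  # server utilization
    if rho >= 1.0:
        raise ValueError("queue is unstable (rho >= 1)")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))
```

As a sanity check, exponential service with mean 1 has second moment 2, and at arrival rate 0.5 the formula reduces to the M/M/1 result W = rho / (mu - lambda) = 1.0.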
Keywords/Search Tags:Deep reinforcement learning, Multiple UAV network, Independent decision making, Dynamic channel allocation, Dynamic time slot allocation