
Research On Spectrum Allocation Technology Of UAV Flight Formation Based On Reinforcement Learning

Posted on: 2021-06-19
Degree: Master
Type: Thesis
Country: China
Candidate: X L Zhou
Full Text: PDF
GTID: 2492306047979669
Subject: Master of Engineering
Abstract/Summary:
With the development of unmanned aerial vehicle (UAV) technology and the spread of its applications, a single UAV operating in a complex and changing environment increasingly struggles to complete tasks on its own, so research on networking multi-UAV flight formations has attracted growing attention. Moreover, future unmanned combat platforms on the information battlefield will carry more frequency-dependent equipment and occupy wider frequency bands; spectrum demand will grow rapidly, and information links will face interference. It is therefore particularly important for an unmanned combat platform to use spectrum autonomously in a contested environment, and for a UAV formation to coordinate its spectrum use on the spot. This thesis focuses on spectrum allocation for UAV cluster networks, studying dynamic channel allocation and dynamic time-slot allocation algorithms based on reinforcement learning.

First, the networking modes of UAV cluster networks and the relevant reinforcement learning algorithms are studied. The networking modes considered are fair competition and priority-based networking; the reinforcement learning algorithms considered are Q-learning and deep reinforcement learning, which are compared and analyzed.

Second, the UAV cluster network under fair-competition networking is studied. Based on the characteristics of dynamic channel allocation and dynamic time-slot allocation, the problem is mapped onto a reinforcement learning environment so that it matches the interface of the learning algorithm. Two algorithms are examined: the standard deep Q-network (DQN) and an improved algorithm (DQN+RC) that replaces the convolutional neural network with a reservoir. Four evaluation metrics are used to compare performance: average collision probability, average reward, channel utilization, and slot utilization. The experimental results show that the improved algorithm performs best, converging about 500 steps faster than DQN; after convergence, its channel utilization and slot utilization are 3% higher than those of DQN.

Finally, spectrum interaction in a network with priorities is studied. The main difference from the fair-competition network is that UAVs are classified and spectrum is allocated by priority, which prevents any single UAV from occupying too many spectrum resources. An M/G/1 queuing model is introduced to evaluate the delay improvement of the priority mechanism. An improved algorithm (DQN+LSTM), which combines reinforcement learning with a long short-term memory network, is proposed to accelerate convergence. The experimental results show that DQN+LSTM converges about 2000 steps faster than DQN+RC, with slot utilization 8% higher after convergence, and the priority network reduces UAV delay by 83%.
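The dynamic channel allocation task described above can be sketched, in greatly simplified form, as a stateless Q-learning problem in which each UAV independently learns a channel that avoids collisions with the others. The thesis itself uses DQN and DQN+RC over a richer environment; the tabular sketch below is only an illustrative assumption, with the UAV count, channel count, reward values, and hyperparameters invented for the example:

```python
import random

def train_channel_selection(n_uavs=2, n_channels=3, episodes=2000,
                            alpha=0.1, epsilon=0.1, seed=0):
    """Stateless (bandit-style) Q-learning: each UAV keeps one Q-value
    per channel and gets reward +1 for a collision-free transmission,
    -1 when another UAV picked the same channel."""
    rng = random.Random(seed)
    q = [[0.0] * n_channels for _ in range(n_uavs)]  # one Q-table per UAV
    for _ in range(episodes):
        # epsilon-greedy channel choice for every UAV
        picks = []
        for i in range(n_uavs):
            if rng.random() < epsilon:
                picks.append(rng.randrange(n_channels))
            else:
                picks.append(max(range(n_channels), key=lambda c: q[i][c]))
        # reward and incremental Q-update
        for i, c in enumerate(picks):
            r = -1.0 if picks.count(c) > 1 else 1.0
            q[i][c] += alpha * (r - q[i][c])
    # greedy policy after training: each UAV's preferred channel
    return [max(range(n_channels), key=lambda c: q[i][c])
            for i in range(n_uavs)]
```

With enough exploration the UAVs typically settle on distinct channels, mirroring the collision-probability metric used in the thesis; the deep variants replace the per-UAV table with a neural (or reservoir) function approximator over an observed spectrum state.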
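The M/G/1 delay evaluation mentioned for the priority network rests on the Pollaczek-Khinchine formula for mean queueing delay, W = lambda * E[S^2] / (2 * (1 - rho)) with utilization rho = lambda * E[S]. A minimal sketch of that computation (the numeric parameters here are illustrative, not values from the thesis):

```python
def mg1_mean_wait(arrival_rate, mean_service, second_moment_service):
    """Mean waiting time in queue for an M/G/1 system via the
    Pollaczek-Khinchine formula: W = lambda * E[S^2] / (2 * (1 - rho))."""
    rho = arrival_rate * mean_service  # server utilization
    if rho >= 1.0:
        raise ValueError("queue is unstable (rho >= 1)")
    return arrival_rate * second_moment_service / (2.0 * (1.0 - rho))
```

As a sanity check, exponential service with mean 1 has second moment 2, and at arrival rate 0.5 the formula reduces to the M/M/1 result W = rho / (mu - lambda) = 1.0.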
Keywords/Search Tags:Deep reinforcement learning, Multiple UAV network, Independent decision making, Dynamic channel allocation, Dynamic time slot allocation