| In recent years, unmanned aerial vehicles (UAVs) have attracted much attention in wireless communication and network systems because of their low cost, high flexibility, and wide coverage. Used as aerial communication platforms, UAVs can make long-distance data transmission more stable in emergencies and improve the communication quality and coverage of the system. However, when a UAV network must support high-throughput data transmission, the spectrum that the system allocates to it may not be sufficient to provide the required quality of service. Moreover, if the locations and task assignments of the UAVs are unreasonable, the channel environment is directly affected and the transmission performance of the system is reduced. To solve these problems, this thesis proposes a spectrum sharing model in which the ground network provides additional spectrum to the UAV network, and uses a reinforcement learning algorithm to determine the optimal location and optimal task of each UAV so as to maximize the data transmission rate. According to the number of antennas at the transceivers in the system model, this thesis studies the performance of the UAV communication network in the single-antenna case and the MIMO (Multiple-Input Multiple-Output) case, respectively. 1. On the one hand, this thesis studies UAV flight scheduling for critical tasks in a single-antenna system under spectrum shortage. First, a spectrum sharing model of the UAV network is established based on the grid method. In this system, UAVs are divided into relay UAVs, which provide services for spectrum owners, and sensing UAVs, which use the obtained spectrum to perform disaster relief tasks. The ultimate goal is to find the optimal positions of the UAVs in the environment and optimize the transmission rate of the system. To maximize the overall data transmission rate of the system, a UAV flight scheduling scheme based on
reinforcement learning is proposed in this thesis. The scheme is analyzed from three aspects: UAV flight trajectory, primary and secondary network selection, and communication mode switching in the system. The classical Q-learning algorithm is used to enable each UAV to learn the system environment online and gradually find the best state. The scheme adopts an adaptive cooperative forwarding mode that combines the amplify-and-forward (AF) protocol with the selective decode-and-forward (DF) protocol, so that a relay UAV can choose its mode according to changes in the channel state and improve the data transmission rate during flight. Simulation results show that the adaptive Q-learning algorithm proposed in this thesis can find the optimal positions of the UAVs. Compared with using the AF protocol or the DF protocol alone, the proposed UAV flight scheduling scheme clearly improves network communication performance. 2. On the other hand, this thesis studies UAV flight scheduling for critical tasks in a MIMO system under spectrum shortage. The system model consists of the primary ground network, which provides spectrum, and the sensing-relay MIMO network, which borrows spectrum. The transmitter and receiver are each equipped with multiple antennas, while each UAV has a single antenna. Multiple single-antenna UAVs form a distributed antenna array, which can be regarded as a multi-antenna relay participating in the data transmission of the network. To improve the data transmission performance of the whole network, this thesis proposes to dynamically optimize the task strategy and network selection strategy of the UAVs based on the Q-learning algorithm. To keep the UAVs from falling into local optima during training, this thesis also improves the action selection strategy of the classical Q-learning algorithm, so that the UAVs adapt well to the environment. The simulation results show that
the flight scheduling scheme based on the improved Q-learning algorithm converges faster than the classical Q-learning algorithm, so the UAVs can determine their optimal positions and task allocation more quickly. At the same time, the introduction of the MIMO network clearly improves the data transmission performance of the system. In addition, the more antennas in the MIMO configuration, the greater the system gain. |
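
The interplay between grid-based position search, Q-learning, and adaptive AF/DF forwarding described above can be illustrated with a minimal sketch. It is not the thesis's implementation. Everything concrete in it is an assumption made for illustration: the 5x5 grid, the node positions `SRC` and `DST`, the toy path-loss model in `hop_snr`, the direct-link SNR `G_DIRECT`, and the learning parameters. The AF and DF rates use the standard half-duplex two-hop expressions with a direct link, and a decaying epsilon-greedy rule stands in for the improved action selection strategy, whose details the abstract does not give.

```python
import math
import random

GRID = 5                       # hypothetical 5x5 grid of candidate UAV cells
SRC, DST = (0, 0), (4, 4)      # hypothetical ground transmitter / receiver
ACTIONS = [(0, 0), (1, 0), (-1, 0), (0, 1), (0, -1)]  # hover and 4 moves
G_DIRECT = 1.0                 # assumed SNR of the weak direct ground link

def hop_snr(a, b, tx_snr=100.0, height=1.0):
    """Toy path-loss SNR between a ground node and a UAV at fixed height."""
    d2 = (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2 + height ** 2
    return tx_snr / d2

def relay_rate(pos):
    """Return (rate in bit/s/Hz, mode) for the better of AF and DF at pos."""
    g1, g2 = hop_snr(SRC, pos), hop_snr(pos, DST)
    af = 0.5 * math.log2(1 + G_DIRECT + g1 * g2 / (g1 + g2 + 1))
    df = 0.5 * min(math.log2(1 + g1), math.log2(1 + G_DIRECT + g2))
    return (af, "AF") if af > df else (df, "DF")

def step(pos, action):
    """Move on the grid (clamped at the border); reward is the relay rate."""
    nxt = tuple(min(GRID - 1, max(0, p + d)) for p, d in zip(pos, action))
    return nxt, relay_rate(nxt)[0]

def train(episodes=4000, steps=40, alpha=0.5, gamma=0.9, seed=0):
    """Tabular Q-learning with a linearly decaying exploration rate."""
    rng = random.Random(seed)
    n = len(ACTIONS)
    q = {((x, y), a): 0.0
         for x in range(GRID) for y in range(GRID) for a in range(n)}
    for ep in range(episodes):
        eps = max(0.1, 1.0 - ep / (0.8 * episodes))  # explore early, exploit late
        pos = (rng.randrange(GRID), rng.randrange(GRID))
        for _ in range(steps):
            if rng.random() < eps:
                a = rng.randrange(n)                        # random action
            else:
                a = max(range(n), key=lambda i: q[(pos, i)])  # greedy action
            nxt, reward = step(pos, ACTIONS[a])
            best_next = max(q[(nxt, i)] for i in range(n))
            # Standard Q-learning update toward reward + discounted best value.
            q[(pos, a)] += alpha * (reward + gamma * best_next - q[(pos, a)])
            pos = nxt
    return q

def greedy_rollout(q, start=(0, 0), steps=20):
    """Follow the learned greedy policy and return the final UAV cell."""
    pos = start
    for _ in range(steps):
        a = max(range(len(ACTIONS)), key=lambda i: q[(pos, i)])
        pos = step(pos, ACTIONS[a])[0]
    return pos

if __name__ == "__main__":
    q_table = train()
    final = greedy_rollout(q_table)
    print("final cell:", final, "rate and mode:", relay_rate(final))
```

In this toy setting the preferred forwarding mode depends on where the relay sits: close to the receiver the source-to-relay hop is weak and AF gives the higher rate, while near the midpoint DF does. The reward surface also has local maxima away from the best cell, which is why some exploration is kept throughout training.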