| With the explosive growth of mobile devices and mobile data traffic, the gap between the demand for and the supply of spectrum resources has become increasingly serious. How to use limited spectrum resources to meet the demands of massive numbers of user devices for higher data rates and lower transmission delay has become a key issue for future mobile communication systems. Heterogeneous wireless networks improve system capacity and spectrum efficiency by integrating different network architectures and combining spectrum sharing techniques. Unmanned aerial vehicles (UAVs) offer flexible deployment, autonomous mobility, and cost-effective management. The on-demand deployment and nearby access provided by a UAV-assisted heterogeneous network (HetNet) help to meet bursty capacity demand in hotspots. Resource allocation in such a network becomes both more flexible and more complex, and it has attracted research interest from academia and industry. This thesis exploits deep reinforcement learning to tackle the resource allocation problem in a UAV-assisted two-tier HetNet. The main contributions are as follows. (1) Downlink co-tier spectrum sharing in a UAV-assisted two-tier HetNet is considered, where a fixed number of links in small cells are divided into high-priority and low-priority user links, and multiple low-priority user links try to access the shared spectrum already occupied by the high-priority user links. A Zero-Gamma multi-agent Actor-Critic deep reinforcement learning based resource allocation scheme is proposed for the spectrum sharing problem between the different priority tiers of user links. Aiming to maximize system capacity, the proposed scheme models the spectrum sharing problem as a reinforcement learning process and employs a deep reinforcement learning algorithm to learn the optimal resource allocation scheme. By setting the discount factor γ to zero, the proposed scheme removes the future cumulative reward term from the learning target and fits the instant reward directly, which removes the estimation error associated with that term. All agents share a common global Critic, which simplifies the algorithm architecture and reduces the number of models. The simulation results show that the proposed scheme achieves performance close to the upper bound with a nearly 75% reduction in model parameters. The convergence behavior, training stability, performance, and strategy correctness are further analyzed to verify the effectiveness of the proposed scheme. (2) Uplink cross-tier spectrum sharing in a UAV-assisted two-tier HetNet is considered, where the number of user links in small cells changes dynamically and the small cell user links try to reuse the spectrum resources shared by macro-cell user links. The goal is to complete the data transmission tasks of the small cell links under the quality constraint of the macro-cell links. A coordination-mini-batch multi-agent deep reinforcement learning based resource allocation scheme with action mask is proposed for the spectrum sharing problem between small cell links and macro-cell links. To make the scheme scalable and robust to the varying number of links, all agents share a common global model. With its independent state and collective reward design, the proposed scheme runs in a centralized-training, distributed-execution manner, which lets the agents collaborate implicitly. The simulation results show that the proposed scheme outperforms all baseline schemes and steadily improves the performance of the whole system. The convergence behavior, hyperparameter robustness, model applicability, and strategy correctness are further analyzed to verify the effectiveness of the proposed scheme. |
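
The effect of setting the discount factor to zero in contribution (1) can be illustrated with a minimal sketch. It assumes a standard one-step bootstrapped Critic target of the form r + γ·V(s'), which the abstract does not spell out; the function names `td_target` and `critic_loss` are hypothetical and are not taken from the thesis.

```python
def td_target(reward, next_value, gamma):
    """One-step bootstrapped target r + gamma * V(s') (assumed form)."""
    return reward + gamma * next_value


def critic_loss(predicted, rewards, next_values, gamma):
    """Mean squared error between Critic predictions and their targets."""
    targets = [td_target(r, v, gamma) for r, v in zip(rewards, next_values)]
    squared = [(p - t) ** 2 for p, t in zip(predicted, targets)]
    return sum(squared) / len(squared)


# With gamma = 0 the bootstrapped term vanishes, so the target is the
# instant reward no matter how inaccurate the next-state estimate is.
instant_only = td_target(reward=1.5, next_value=100.0, gamma=0.0)

# A Critic that predicts the instant rewards exactly then has zero loss,
# even when the (hypothetical) next-state value estimates are far off.
loss = critic_loss([1.0, 2.0], [1.0, 2.0], [9.0, -3.0], gamma=0.0)
```

In this sketch the next-state value never enters the target when γ = 0, so errors in that estimate cannot propagate into the Critic's training signal.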