In recent years, with the rapid development of Internet of Things (IoT) technology and the large-scale deployment of base stations, a new network architecture has emerged: the Ultra-Dense Network (UDN). The number of Machine-Type Communication Devices (MTCDs) in such networks has surged dramatically. This poses formidable challenges for large-scale MTCD access and resource allocation in UDNs, involving access conflicts, delay, power and spectrum allocation, and energy consumption. More advanced access technologies and more efficient resource-allocation schemes are urgently needed to support diverse application scenarios in UDNs and to ensure network reliability and stability. This thesis proposes a learning-based ordered-competition access scheme for MTCDs with low-latency requirements, as well as a model-driven resource-allocation scheme for UDN scenarios; deep reinforcement learning is combined with both. The main contributions of this thesis are summarized below:

(1) During random access, MTCDs may experience delay caused by collision avoidance under limited preamble resources, which conflicts with low-latency requirements. To address this issue, this thesis proposes an ordered-competition mechanism that transforms random competition into ordered competition, thereby reducing conflicts during access. The mechanism employs a two-step access process accomplished through queuing and learning. Parameters associated with random access, such as arrival rate, latency requirement, and device quantity, are defined as queuing factors. These queuing factors guide MTCDs with different low-latency requirements to queue and then learn to select the optimal access slot and preamble.

(2) This thesis proposes a model-driven reinforcement learning algorithm to address the slow and difficult convergence of data-driven reinforcement learning. Unlike the data-driven framework, the model-driven framework allows one-to-one modeling of each specific problem. In this framework, the target optimization function is first determined and then modified with an augmented Lagrangian function. By alternately optimizing the inputs to the neural network structure via the alternating direction method of multipliers (ADMM), the optimal solution can be obtained in fewer iterations.

(3) To investigate the resource-allocation problem in ultra-dense networks with limited channel state information, a new model-driven learning framework is designed in this thesis. The framework solves the resource-allocation problem, including base station selection and power and subcarrier allocation, optimized through ADMM; a deep reinforcement learning algorithm is used to optimize the weights and solve the objective function, improving algorithm performance. The framework reduces communication overhead by using effective channel state information instead of redundant information, and strengthens the constraint on minimum user quality-of-service requirements to maximize small-cell spectral efficiency while ensuring user experience.
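The ordered-competition idea in contribution (1) can be sketched as follows: devices are ranked by a queuing factor built from their arrival rate, latency requirement, and the device count, then assigned access slots and preambles in queue order instead of competing randomly. The weighting formula, field names, and values below are illustrative assumptions, not the thesis's actual parameterization.

```python
# Hypothetical sketch: rank MTCDs by a queuing factor, then hand out
# (slot, preamble) pairs in that order so access becomes ordered competition.

def queuing_factor(arrival_rate, latency_req, n_devices):
    """Smaller latency requirement -> smaller factor -> higher priority.
    This particular formula is an assumption for illustration."""
    return latency_req / (arrival_rate * n_devices)

def ordered_access(devices, n_slots, n_preambles):
    """Queue devices by factor, then assign (slot, preamble) pairs in order."""
    ranked = sorted(devices, key=lambda d: queuing_factor(
        d["arrival_rate"], d["latency_req"], len(devices)))
    schedule = {}
    for i, dev in enumerate(ranked):
        slot, preamble = divmod(i, n_preambles)
        if slot >= n_slots:
            break  # remaining devices defer to the next access cycle
        schedule[dev["id"]] = (slot, preamble)
    return schedule

devices = [
    {"id": "mtcd1", "arrival_rate": 2.0, "latency_req": 10.0},
    {"id": "mtcd2", "arrival_rate": 1.0, "latency_req": 1.0},
    {"id": "mtcd3", "arrival_rate": 1.0, "latency_req": 5.0},
]
print(ordered_access(devices, n_slots=2, n_preambles=2))
```

Because every queued device receives a distinct (slot, preamble) pair, collisions within one access cycle are avoided by construction; in the thesis the selection step is additionally learned rather than fixed.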
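The alternating scheme in contribution (2) can be illustrated on a toy problem: minimize 0.5*(x - a)^2 + lam*|z| subject to x = z, using the scaled augmented Lagrangian and ADMM's alternating block updates. The toy objective and all constants are assumptions chosen so each subproblem has a closed form; the thesis applies the same alternating principle to its much larger resource-allocation objective with neural-network components.

```python
# Minimal scaled-form ADMM on a scalar split problem:
#   minimize 0.5*(x - a)^2 + lam*|z|   subject to  x = z.

def soft_threshold(v, t):
    """Proximal operator of t*|.| (closed-form z-update)."""
    return max(abs(v) - t, 0.0) * (1.0 if v >= 0 else -1.0)

def admm_scalar(a, lam=0.5, rho=1.0, iters=50):
    x = z = u = 0.0
    for _ in range(iters):
        x = (a + rho * (z - u)) / (1.0 + rho)  # primal block 1 (quadratic)
        z = soft_threshold(x + u, lam / rho)   # primal block 2 (prox of |.|)
        u += x - z                             # scaled dual (multiplier) update
    return x, z

# Analytically, the minimizer is the soft-threshold of a at lam: a - lam = 2.5.
x, z = admm_scalar(a=3.0, lam=0.5)
print(round(x, 3), round(z, 3))
```

Each pass optimizes one block with the other held fixed, and the multiplier update enforces the consensus constraint, which is why relatively few iterations suffice on well-structured problems.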
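For contribution (3), a simple greedy baseline conveys the shape of the joint problem: each user attaches to the base station with the strongest effective channel gain, subcarriers and power are split evenly, and the minimum quality-of-service constraint is then checked. The channel gains, rate model, and QoS threshold below are assumed values; the thesis instead solves association, power, and subcarriers jointly via ADMM with deep-reinforcement-learning-optimized weights.

```python
import math

# Illustrative greedy baseline for joint base-station selection, subcarrier,
# and power allocation under a minimum-rate (QoS) constraint.

def rate(gain, power, noise=1e-3):
    """Shannon rate per subcarrier for an assumed noise level."""
    return math.log2(1.0 + gain * power / noise)

def greedy_allocate(gains, n_subcarriers, p_total, r_min):
    """gains[u][b]: effective channel gain of user u towards base station b."""
    n_users = len(gains)
    # Associate each user with its best base station (effective CSI only).
    assoc = [max(range(len(g)), key=lambda b: g[b]) for g in gains]
    power = p_total / n_users                # equal power split
    sub_per_user = n_subcarriers // n_users  # equal subcarrier split
    rates = [sub_per_user * rate(gains[u][assoc[u]], power)
             for u in range(n_users)]
    feasible = all(r >= r_min for r in rates)  # min-QoS check
    return assoc, rates, feasible

gains = [[0.8, 0.1], [0.2, 0.9]]
assoc, rates, ok = greedy_allocate(gains, n_subcarriers=4,
                                   p_total=2.0, r_min=1.0)
print(assoc, ok)
```

Using only the per-user effective gains (rather than the full channel matrix) mirrors the overhead-reduction point above: the association decision here never consults the weaker cross links beyond the argmax.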