| As a main application scenario and key research object of B5G/6G, the Internet of Vehicles (IoV) has been widely recognized for its important role in developing intelligent transportation, improving urban operation efficiency, and enhancing people’s quality of life. In the IoV, where people, vehicles and roads coexist, there are not only traditional communication services that meet the needs of various human activities, but also new services that ensure driving safety and support vehicle-road coordination. Because different services differ greatly in their requirements on key performance indicators such as transmission bandwidth and transmission delay, how to allocate wireless network resources efficiently while meeting differentiated service requirements has become a research hotspot in the IoV. This thesis studies task scheduling and resource allocation in different service scenarios for various application requirements in the IoV, and designs intelligent solutions based on Deep Reinforcement Learning (DRL). The main research work of the thesis is as follows: (1) In view of the dual high requirements on bandwidth and timeliness of real-time intelligent applications in the IoV, an intelligent task scheduling algorithm is studied. The thesis first analyzes the distributed machine learning (DML) process in which vehicles perceive road information and train local models, and roadside units (RSUs) aggregate the model parameters of multiple vehicles and deliver them. The scheduling problem of vehicular nodes is modeled as a Markov Decision Process (MDP). According to the service characteristics, a system utility function is designed that accounts for the timeliness of the local model, the effective sensing area of the local node and the delay, and a vehicular node scheduling scheme based on the deep Policy Gradient (PG) algorithm is proposed. A complete Monte Carlo episodic simulation is constructed in the thesis, and the experimental results show that the proposed 
algorithm has advantages in the overall system utility. (2) For scenarios in which multiple levels of Enhanced Mobile Broadband (eMBB) services coexist in the IoV, network slicing technology, which can provide differentiated and customized services, is introduced, and the dynamic optimization problem of intra-slice task scheduling and inter-slice resource allocation is studied. Because service scheduling decisions within slices and resource allocation decisions between slices are made at different time intervals, this thesis proposes a Heterogeneous Markov Decision Process (HMDP) model suitable for multi-dimensional asynchronous joint decision optimization problems. A Dynamic Bayesian Network (DBN) is used to describe the state and action correlations of the HMDP model, with a focus on analyzing how the intersection of the upper and lower sub-processes affects the Markov property. On this basis, a layered DRL architecture is designed according to the structural characteristics of the HMDP, and the deep policy gradient algorithm is used to achieve joint decision optimization. A layered DRL-based joint algorithm for multi-level eMBB task scheduling and network slice resource allocation is proposed. The simulation results show that the proposed algorithm improves the system performance indicators related to the profit of served tasks, the queuing delay cost, the power consumption cost and the frequency resource cost in the joint optimization of task scheduling and slice resource allocation for multi-level eMBB services. (3) Considering that the Ultra-Reliable and Low-Latency Communication (URLLC) service is an important service type that supports vehicle control in the IoV, the thesis further studies the joint dynamic optimization problem of scheduling and resource allocation in the scenario where URLLC and eMBB services coexist. According to the low-latency requirements of URLLC services, the concept of mini-slots in 5G NR is introduced into the system model to support 
the scheduling of URLLC services and eMBB services at different time intervals. At the same time, to remain consistent with the system process, the HMDP model is adjusted so that the MDPs corresponding to inter-slice resource allocation, eMBB service scheduling, and URLLC service scheduling are executed completely asynchronously. On this basis, because URLLC services require high-frequency scheduling decisions, a layered DRL architecture is designed that combines a classical scheduling algorithm with the deep policy gradient algorithm, and an intelligent joint resource allocation and service scheduling algorithm is proposed. The simulation results show that the proposed algorithm improves the system performance indicators related to the profit of served tasks, the queuing delay and the frequency resource cost in the joint optimization of service scheduling and slice resource allocation in the mixed URLLC and eMBB service scenario, and ensures that the ultra-low latency requirements of URLLC services are met. |
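
The asynchronous structure described in part (3) can be illustrated with a minimal sketch of when each of the three decision processes fires. This is not the thesis's algorithm: the function name `decision_timeline`, the two period constants, and the choice to schedule eMBB once per slot are assumptions made only for illustration. The sketch shows the timing relationship alone (inter-slice allocation is slowest, URLLC scheduling is fastest) and contains no learning component.

```python
from collections import Counter

# Assumption: 2-symbol mini-slots inside a 14-symbol slot give 7 mini-slots per slot.
MINI_SLOTS_PER_SLOT = 7
# Assumption: slice resources are re-allocated once every 10 slots (hypothetical period).
SLOTS_PER_ALLOCATION = 10


def decision_timeline(num_slots):
    """Yield (mini_slot_index, decision_name) in the order the decisions fire."""
    for slot in range(num_slots):
        for mini in range(MINI_SLOTS_PER_SLOT):
            t = slot * MINI_SLOTS_PER_SLOT + mini
            if mini == 0 and slot % SLOTS_PER_ALLOCATION == 0:
                # Slowest process: inter-slice resource allocation.
                yield t, "slice_allocation"
            if mini == 0:
                # Intermediate process (assumed): eMBB scheduling once per slot.
                yield t, "embb_scheduling"
            # Fastest process: URLLC scheduling at every mini-slot.
            yield t, "urllc_scheduling"


timeline = list(decision_timeline(num_slots=20))
counts = Counter(name for _, name in timeline)
print(counts)
```

In a layered design of this kind, the high-frequency URLLC branch is the natural place for a lightweight classical scheduler, while the slower branches leave enough time between decisions for a learned policy; the periods above would need to be replaced by the values used in the actual system model.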