With the rapid development of the IoT (Internet of Things), the application scenarios in which wireless devices access the network have been greatly enriched, and V2X (Vehicle to Everything) is one of the most typical. By interconnecting vehicles with other wireless devices, V2X is one of the main directions of future smart cities and improves traffic safety through data sharing. The problem of resource allocation in network communication has been studied for decades. Compared with traditional cellular networks, the V2X network environment is more complex, fewer resources are available for allocation, and the requirements placed on communication are more stringent. Model-free machine learning methods combined with deep neural networks have been widely applied in highly complex practical scenarios; their powerful computing capability and data-driven mode of operation are gradually replacing traditional model-based algorithms. Traditional model-based power allocation algorithms require the mathematical model to be analytically tractable and usually have high computational complexity, while the limited computing resources of V2X devices restrict the complexity that can be supported in practice. This thesis formulates the multi-base-station power allocation problem in a V2X environment and sorts out the physical interference factors affecting V2X communication links. The problem is then modeled as a Markov Decision Process, the top-level design of the reinforcement learning approach is analyzed mathematically, and four reinforcement learning algorithms are designed for V2X scenarios: REINFORCE, DQN, PR-DDQN, and DDPG. In addition, this thesis designs a flexible and scalable system architecture for algorithm verification. Finally, a simulation environment that emulates real V2X communication is built, in which the above algorithms are compared and analyzed along multiple dimensions. The results show that the DDPG algorithm performs best, which verifies the feasibility of reinforcement learning algorithms in V2X scenarios.
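To make the Markov Decision Process framing concrete, the sketch below shows one way a multi-link V2X power-allocation step could be cast as an MDP: the state collects channel gains, the action selects a discrete transmit-power level per link, and the reward is the sum rate under mutual interference. This is a minimal illustration, not the thesis's actual environment; the class name, the Rayleigh-fading assumption, the power levels, and the reward shaping are all assumptions introduced here.

```python
import numpy as np


class ToyV2XPowerEnv:
    """Toy MDP for V2X power allocation (illustrative assumption, not the thesis model).

    State  : direct channel gain of each link plus its aggregate cross-link gain.
    Action : index of a discrete transmit-power level for each link.
    Reward : sum spectral efficiency of all links under mutual interference.
    """

    def __init__(self, n_links=4, power_levels=(0.01, 0.1, 0.5, 1.0),
                 noise_power=1e-3, seed=0):
        self.n_links = n_links
        self.power_levels = np.asarray(power_levels)
        self.noise_power = noise_power
        self.rng = np.random.default_rng(seed)

    def reset(self):
        # gains[i, j]: channel gain from transmitter j to receiver i (Rayleigh fading assumed)
        self.gains = self.rng.exponential(scale=1.0, size=(self.n_links, self.n_links))
        return self._state()

    def _state(self):
        # Observation: per-link direct gain and the total gain from all interfering transmitters
        direct = np.diag(self.gains)
        cross = self.gains.sum(axis=1) - direct
        return np.concatenate([direct, cross])

    def step(self, action):
        # action[i] picks the transmit-power level of link i
        p = self.power_levels[np.asarray(action)]
        signal = np.diag(self.gains) * p                 # desired received power per link
        interference = self.gains @ p - signal           # power received from all other links
        sinr = signal / (interference + self.noise_power)
        reward = float(np.log2(1.0 + sinr).sum())        # sum rate in bit/s/Hz
        # Draw fresh fading for the next decision step (non-terminating toy episode)
        self.gains = self.rng.exponential(scale=1.0, size=(self.n_links, self.n_links))
        return self._state(), reward, False, {}


if __name__ == "__main__":
    # Random-policy rollout, just to show the interaction loop an RL agent would use
    env = ToyV2XPowerEnv()
    state = env.reset()
    for _ in range(3):
        action = env.rng.integers(0, len(env.power_levels), size=env.n_links)
        state, reward, done, _ = env.step(action)
        print(f"sum rate: {reward:.2f} bit/s/Hz")
```

Under this kind of interface, a value-based agent such as DQN would act on the discrete power indices, while a policy-gradient agent such as DDPG would instead output continuous power values, which is one plausible reason the thesis compares both families.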