With the continuous development of Internet of Vehicles (IoV) technology, people's travel has become far more convenient. Facing a rapidly growing number of on-board users, the IoV generates a large volume of service requests. These requests require timely responses, yet the information-processing capacity of on-board mobile devices is limited. An effective way to solve this problem is to offload the service requests to a cloud computing or edge computing system for processing, but doing so in turn poses great challenges for the reasonable allocation of resources in the IoV cloud and edge systems. Although there is much research on IoV resource optimization, most resource allocation schemes rely on strong assumptions and cannot adapt to the dynamic IoV environment. A hierarchical reinforcement learning (HRL) algorithm requires only a small number of such assumptions: through continuous interaction with the environment, it gradually learns the optimal action policy and obtains an adaptive optimal resource allocation scheme. Therefore, this thesis applies HRL algorithms to the IoV resource allocation problem in cloud computing and edge computing scenarios.

First, this thesis studies resource allocation in the IoV cloud system. Taking the resource-constrained IoV as the object and combining it with cloud computing, it proposes a resource allocation model for the IoV cloud system based on a semi-Markov decision process (SMDP) and solves the model with an HRL algorithm. On the basis of this model, an adaptive optimal resource allocation scheme is proposed. The scheme not only guarantees the quality of service of the IoV cloud system and the quality of experience of vehicle users, but also maximizes the revenue of the whole IoV cloud system. Simulation results show that, compared with schemes based on an ordinary reinforcement learning algorithm and on a greedy algorithm, the HRL-based scheme better meets the needs of vehicle users, solves the resource allocation problem of the IoV cloud system effectively, and also achieves a measurable improvement in system revenue.

Second, resource allocation in the IoV edge system is studied. An adaptive resource allocation scheme based on an SMDP and an HRL algorithm is proposed. This scheme balances the resource consumption and system revenue of the IoV edge system, guarantees the quality of service of the system and the quality of experience of on-board users, and maximizes the long-term revenue of the edge system. In this scheme, the optimal resource allocation problem in the edge system is formulated as a two-layer "Markov decision process + semi-Markov decision process" model: the upper-layer Markov decision process decides whether to accept an IoV service request, while the lower-layer semi-Markov decision process reallocates the resources. The hierarchical model is then solved with an HRL algorithm, so that the resulting resource allocation scheme can be applied to an edge system in a dynamic environment. Experimental results show that, compared with schemes based on an ordinary reinforcement learning algorithm and on a greedy algorithm, this scheme achieves significant performance gains.

In summary, this thesis studies the IoV resource allocation problem in different scenarios. By combining different hierarchical reinforcement learning algorithms with the semi-Markov decision process formulation, the obtained resource allocation schemes can better solve the IoV resource allocation problem in each scenario.
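The two-layer "MDP + SMDP" formulation summarized above can be illustrated with a minimal sketch. This is not the thesis's actual model: the capacity, the allocation choices, the reward (revenue minus a holding cost), the random sojourn times, and the tabular Q-learning-style updates are all simplifying assumptions made here only to show how an upper-layer admission decision and a lower-layer sojourn-time-discounted allocation decision fit together.

```python
import random

# Illustrative two-layer "MDP + SMDP" sketch (assumed model, not the thesis's).
# Upper layer (MDP): decide whether to ACCEPT or REJECT an incoming request.
# Lower layer (SMDP): choose how many resource units to allocate; its update
# discounts over the service's sojourn time tau, i.e. gamma ** tau.

CAPACITY = 10          # total resource units at the node (assumed)
ALLOCATIONS = [1, 2]   # candidate units per accepted request (assumed)
ALPHA, GAMMA = 0.1, 0.9

upper_q = {}  # state (free_units,) -> {"accept"/"reject": value}
lower_q = {}  # state (free_units,) -> {units: value}

def q(table, state, actions):
    """Return (creating if needed) the Q-value dict for a state."""
    return table.setdefault(state, {a: 0.0 for a in actions})

def choose(table, state, actions, eps, rng):
    """Epsilon-greedy action selection."""
    vals = q(table, state, actions)
    if rng.random() < eps:
        return rng.choice(actions)
    return max(vals, key=vals.get)

def train(episodes=2000, seed=0):
    rng = random.Random(seed)
    for _ in range(episodes):
        free = CAPACITY
        for _step in range(20):                  # requests per episode
            s = (free,)
            act = choose(upper_q, s, ["accept", "reject"], 0.1, rng)
            if act == "reject" or free == 0:
                r, free2, g = 0.0, free, GAMMA   # no revenue, state unchanged
            else:
                units = choose(lower_q, s, ALLOCATIONS, 0.1, rng)
                tau = rng.randint(1, 3)          # sojourn time of the service
                r = 2.0 * units - 0.5 * tau      # revenue minus holding cost
                free2 = max(free - units, 0)
                g = GAMMA ** tau                 # SMDP: discount over tau
                lv = q(lower_q, s, ALLOCATIONS)
                nxt = max(q(lower_q, (free2,), ALLOCATIONS).values())
                lv[units] += ALPHA * (r + g * nxt - lv[units])
            uv = q(upper_q, s, ["accept", "reject"])
            nxt = max(q(upper_q, (free2,), ["accept", "reject"]).values())
            uv[act] += ALPHA * (r + g * nxt - uv[act])
            free = free2
            if rng.random() < 0.3:               # a service departs
                free = min(free + 1, CAPACITY)

train()
# Learned upper-layer decision when the node has all units free:
policy = max(q(upper_q, (CAPACITY,), ["accept", "reject"]),
             key=q(upper_q, (CAPACITY,), ["accept", "reject"]).get)
print(policy)
```

Under these toy rewards every accepted request yields positive revenue, so the learned upper-layer policy at full capacity converges to accepting requests; the design point the sketch illustrates is that only the lower layer needs the sojourn-time discount `gamma ** tau`, which is what distinguishes the SMDP layer from the plain MDP layer.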