| With the continuous development of satellite and terrestrial network technology, and in order to meet people's increasing demand for applications requiring massive connectivity, low delay, and high reliability, the space-air-ground integrated network (SAGIN) has become a key technology in 6G networks. However, the communication, computing, and caching resources on satellites are very limited. Moreover, part of the network bandwidth, computing, and caching resources must be used to keep the satellite itself functioning normally, which makes satellite resources more valuable than ground resources. How to dynamically and effectively allocate resources in SAGIN according to the needs of ground users is therefore a major challenge for SAGIN applications. In addition, because SAGIN integrates a variety of satellite, aerial, and ground networks, how to uniformly manage and allocate its heterogeneous network resources, so that applications can be delivered to users with low end-to-end delay, is another challenge for current SAGIN applications. This paper studies the resource allocation problem in SAGIN with user end-to-end delay taken into account. We adopt deep reinforcement learning (DRL) to provide a green, economical, and reliable resource allocation scheme for satellites and unmanned aerial vehicle (UAV) clusters in SAGIN, and to reduce the end-to-end delay of the overall process of serving users while satisfying the relevant resource constraints (network bandwidth, computing resources, and cache). The main contributions of this paper are as follows: (1) Joint Data Streaming and Resource Allocation for Satellite-Terrestrial Networks via Deep Reinforcement Learning. We consider a scenario of satellite and ground networks. The proposed scheme plans the process of streaming data from satellites to the ground and makes full use of the communication, computing, and caching resources in the network to reduce the overall
end-to-end delay. The joint data streaming and resource allocation problem is modeled as a Markov decision process (MDP). To solve this problem, this paper proposes a deep reinforcement learning algorithm. After training, the algorithm selects the data transmission mode that best serves user needs and, according to the selected mode, allocates the resources of the geostationary Earth orbit (GEO) and low Earth orbit (LEO) satellites. Simulation results show that the TD3 (twin delayed deep deterministic policy gradient) algorithm in this paper clearly outperforms other deep reinforcement learning methods and a random allocation method: its average delay is about 59.9% of that of the DDPG algorithm and 20.4% of that of the A3C algorithm. (2) Joint SFC Orchestration and Resource Allocation for Space-Air-Ground Integrated Networks via Deep Reinforcement Learning. In this part, we consider a scenario of satellites, UAVs, and a ground network, in which the SDN controller obtains the information of the nodes in the network (geographical location, resource stock, etc.) in real time. The scheme efficiently deploys virtual network functions (VNFs) and allocates the resources of the nodes in the network, so as to serve users' service function chain (SFC) requests and minimize the end-to-end delay of services. Similarly, we model the joint SFC orchestration and resource allocation problem as a Markov decision process and propose a TD3-based deep reinforcement learning algorithm to solve it. The method first obtains the network resources and node status at the current time, then deploys the VNFs of the SFC requested by users on the nodes and allocates the resources of each node, so as to minimize the overall end-to-end delay. The simulation results show that the TD3 algorithm in this paper again outperforms the other algorithms compared. |
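
Both contributions rely on TD3, which differs from DDPG mainly in how the critic target is formed: the target action is smoothed with clipped noise, and the smaller of two target-critic estimates is used. The following is a minimal sketch of that target for a single transition, not the implementation used in this paper. The function name `td3_target`, the hyperparameter defaults, the toy critics, and the choice of reward as negative delay are all assumptions made for illustration; the paper's actual state, action, and reward definitions are not given here.

```python
import random


def td3_target(reward, done, next_action, q1_next, q2_next,
               gamma=0.99, noise_std=0.2, noise_clip=0.5,
               action_low=-1.0, action_high=1.0, rng=random):
    """Clipped double-Q target of TD3 for one transition (illustrative).

    q1_next and q2_next are callables standing in for the two target
    critics; each maps a scalar action to a Q-value for the next state.
    """
    if done:
        # Terminal transition: nothing to bootstrap from.
        return reward
    # Target policy smoothing: add clipped Gaussian noise to the target action.
    noise = max(-noise_clip, min(noise_clip, rng.gauss(0.0, noise_std)))
    smoothed = max(action_low, min(action_high, next_action + noise))
    # Clipped double-Q learning: use the smaller of the two critic estimates.
    q_min = min(q1_next(smoothed), q2_next(smoothed))
    return reward + gamma * q_min


# Hypothetical toy critics; the second one is more pessimistic.
q1 = lambda a: 10.0 - a
q2 = lambda a: 8.0 - a

# Assumed reward: negative end-to-end delay (0.35 s), so lower delay is better.
# noise_std=0.0 disables smoothing to make the result deterministic.
y = td3_target(reward=-0.35, done=False, next_action=0.2,
               q1_next=q1, q2_next=q2, noise_std=0.0)
print(y)  # approximately 7.372, i.e. -0.35 + 0.99 * min(9.8, 7.8)
```

Taking the minimum of the two critics counteracts the value overestimation that a single critic, as in DDPG, tends to accumulate.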