| Mobile edge computing (MEC) has become a promising solution to the limited computation and battery resources of mobile devices: computationally intensive tasks are offloaded to computationally resource-rich edge servers, which provides resilient resources for resource-intensive applications and reduces device energy consumption. The most critical issue in MEC is determining the offloading strategy, such as whether to offload tasks to edge servers for execution and how to allocate computing resources to edge servers. Traditional offloading strategies for computing tasks are either inefficient or unable to adjust dynamically to changes in the edge environment, and with the development of artificial intelligence, deep reinforcement learning has become an important way to solve this problem. However, most existing computation offloading strategies based on deep reinforcement learning consider only a continuous action space or only a discrete action space when defining actions, which inevitably reduces the accuracy of the offloading strategy. In addition, we find that existing computation offloading strategies adapt poorly to new or changing environments, so they must restart sampling and learning, which lowers decision efficiency. To address these issues, this paper makes the following two contributions: · We propose a computation offloading strategy based on reinforcement learning with a hybrid action space. To address the action-space limitations of current deep reinforcement learning-based computation offloading strategies, we introduce the concept of a parametric action space, establish a computation offloading model based on it, and formulate the edge computation offloading problem as a non-convex multi-objective optimization problem based on the
established model, and finally propose a hybrid action space-based reinforcement learning algorithm, Hybrid-PPO, to solve the optimization problem. · We propose a computation offloading strategy based on meta-reinforcement learning with a hybrid action space. To address the limited adaptability of current deep reinforcement learning-based computation offloading strategies and to accommodate pipelined task processing, we modify the computation offloading model, introduce meta-learning, and propose a hybrid action space-based meta-reinforcement learning algorithm, Meta-HybridPPO, to solve the non-convex multi-objective optimization problem built on this modified model. Both proposed strategies are evaluated in simulation experiments and compared with existing computation offloading strategies. The experimental results show that the strategy based on reinforcement learning with a hybrid action space reduces task processing latency and computation energy consumption more than previous deep reinforcement learning-based computation offloading strategies, and that the strategy based on meta-reinforcement learning with a hybrid action space not only outperforms the other algorithms on both metrics but also significantly improves adaptability. |
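
To make the notion of a hybrid action concrete, the following is a minimal sketch of sampling one action that has both a discrete and a continuous part. It is not the Hybrid-PPO algorithm described above. The names `HybridAction`, `sample_hybrid_action`, `target`, and `cpu_fraction`, as well as the choice of a softmax over targets and a clipped Gaussian for the resource share, are assumptions made purely for illustration.

```python
import math
import random
from dataclasses import dataclass


@dataclass
class HybridAction:
    # Discrete part (assumed encoding): 0 = run locally, k >= 1 = offload to edge server k.
    target: int
    # Continuous part (assumed meaning): share of the chosen target's CPU, in [0, 1].
    cpu_fraction: float


def softmax(logits):
    """Turn raw scores into a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample_hybrid_action(logits, means, std, rng):
    """Sample the discrete target first, then the continuous parameter tied to it."""
    probs = softmax(logits)
    target = rng.choices(range(len(logits)), weights=probs, k=1)[0]
    # Each discrete choice has its own mean for the continuous parameter.
    raw = rng.gauss(means[target], std)
    cpu_fraction = min(1.0, max(0.0, raw))  # clip into the valid range
    return HybridAction(target, cpu_fraction)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
action = sample_hybrid_action(
    logits=[0.1, 1.2, -0.3],   # hypothetical scores: local, server 1, server 2
    means=[0.5, 0.8, 0.3],     # hypothetical per-target mean CPU share
    std=0.1,
    rng=rng,
)
print(action)
```

In a policy-gradient setting, the logits and means would typically be produced by a neural network from the observed state. A purely discrete policy would have to quantize `cpu_fraction`, and a purely continuous one would have to relax `target`, which is the loss of accuracy the abstract refers to.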