Research On UAV Path Planning Based On Reinforcement Learning Under Mobile Edge Computing Architecture

Posted on: 2021-09-25
Degree: Master
Type: Thesis
Country: China
Candidate: Q Liu
Full Text: PDF
GTID: 2512306512486524
Subject: Communication and Information System
Abstract/Summary:
With the rapid increase in the number of smart devices, the computing requests from terminal users (TUs) exceed the capability of their devices. Mobile edge computing (MEC) architecture places servers at the network edge in order to provide the TUs with timely and effective computational service. However, building MEC infrastructure is expensive in remote or disaster areas. An unmanned aerial vehicle (UAV), which is flexible and does not depend on fixed infrastructure, can assist the MEC system. In this context, this thesis focuses on a UAV-mounted MEC structure that provides computational service for the TUs. How to optimize the UAV trajectory under the constraint of a limited battery is a current research hot spot. This thesis investigates the issue and achieves the following results.
(1) Consider the scenario with static TUs and a discrete UAV trajectory: a UAV-mounted MEC structure is deployed, and the UAV trajectory is designed to maximize the tasks offloaded by the TUs. The maximization problem is modeled as a Markov decision process (MDP), and a value-based reinforcement learning (RL) algorithm is designed to find the optimal policy. Simulation results show that the designed algorithm can effectively choose an approximately optimal trajectory for the UAV to provide high-quality computational service for the TUs.
(2) Consider the scenario with dynamic TUs and a discrete UAV trajectory: the UAV trajectory is designed to maximize the system utility. The movement of each TU is modeled by the Gauss-Markov random model. Under the quality-of-service (QoS) constraints of the TUs, a value-based RL algorithm with a deep neural network (DNN) is proposed. Simulation results show that the proposed algorithm can design the UAV trajectory over dynamic TUs so as to guarantee the QoS of the TUs, and that it achieves higher system utility than the traditional RL algorithm.
(3) Consider the scenario with dynamic TUs and a continuous UAV trajectory: the UAV trajectory is optimized to maximize the system utility. The optimization problem is formulated as an MDP with a continuous action space, and a gradient-based RL algorithm with a DNN is developed. Simulation results show that the developed algorithm can effectively plan the UAV trajectory and achieves higher system utility than the normal gradient-based RL algorithm.
Finally, the research results and some deficiencies of this thesis are summarized, and further research directions are discussed.
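As an illustration of the value-based RL formulation in result (1), the following is a minimal sketch of tabular Q-learning on a toy problem. It is not the algorithm from the thesis. The 5x5 grid, the TU positions, the unit reward per served TU, the battery budget of 12 moves, and all function names are assumptions made here for illustration only.

```python
import random

# Hypothetical toy setup (not taken from the thesis).
GRID = 5                                        # assumed grid side length
TUS = {(1, 3), (3, 1), (4, 4)}                  # assumed static TU positions
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]    # one-cell moves along y and x
BATTERY = 12                                    # assumed number of moves per flight


def step(pos, served, action):
    """Move one cell (clipped to the grid); return (next_pos, next_served, reward)."""
    x = min(max(pos[0] + action[0], 0), GRID - 1)
    y = min(max(pos[1] + action[1], 0), GRID - 1)
    nxt = (x, y)
    if nxt in TUS and nxt not in served:
        # Reward 1 the first time the UAV reaches a TU (stands in for offloaded tasks).
        return nxt, served | {nxt}, 1.0
    return nxt, served, 0.0


def train(episodes=3000, alpha=0.1, gamma=0.95, epsilon=0.2, seed=0):
    """Tabular Q-learning with an epsilon-greedy behavior policy."""
    rng = random.Random(seed)
    q = {}  # maps state (pos, served) to a list of action values
    for _ in range(episodes):
        pos, served = (0, 0), frozenset()
        for _ in range(BATTERY):
            values = q.setdefault((pos, served), [0.0] * len(ACTIONS))
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))
            else:
                a = max(range(len(ACTIONS)), key=values.__getitem__)
            pos, served, reward = step(pos, served, ACTIONS[a])
            next_values = q.setdefault((pos, served), [0.0] * len(ACTIONS))
            # Q-learning update: bootstrap from the best next-state action value.
            values[a] += alpha * (reward + gamma * max(next_values) - values[a])
    return q


def greedy_rollout(q):
    """Follow the learned greedy policy for one flight; return (path, total_reward)."""
    pos, served, total = (0, 0), frozenset(), 0.0
    path = [pos]
    for _ in range(BATTERY):
        values = q.get((pos, served), [0.0] * len(ACTIONS))
        a = max(range(len(ACTIONS)), key=values.__getitem__)
        pos, served, reward = step(pos, served, ACTIONS[a])
        total += reward
        path.append(pos)
    return path, total


if __name__ == "__main__":
    path, total = greedy_rollout(train())
    print("path:", path)
    print("TUs served:", total)
```

The state here is the pair (UAV position, set of TUs already served), so the agent has no incentive to revisit a TU. The battery limit is modeled only as a fixed episode length. A formulation closer to results (2) and (3) would replace the table with a DNN and add TU mobility, which this sketch does not attempt.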
Keywords/Search Tags: mobile edge computing (MEC), reinforcement learning (RL), unmanned aerial vehicle (UAV), path planning, Markov decision process (MDP)