Unmanned Aerial Vehicles (UAVs) are widely used in many research fields because of their convenience and maneuverability. In urban environments, drones often perform tasks that are difficult for humans to complete, such as search and rescue. Urban environments typically contain static obstacles, such as office buildings and street lights, as well as dynamic obstacles, such as birds and other drones performing their own tasks. In this setting, a UAV must plan a reasonable three-dimensional path to avoid obstacles. In practical applications, however, the environment is complex and variable, and the UAV has no prior knowledge of obstacle locations, especially the trajectories of dynamic obstacles, so typical UAV path planning algorithms cannot solve this real-time path planning problem. This thesis therefore studies the three-dimensional path planning problem of a UAV in an urban environment, with the goal of avoiding obstacles without knowing the distribution of obstacles in advance. The limited onboard energy of the UAV in a real environment is also considered: multiple wireless charging stations are introduced, and the UAV is recharged through Line-of-Sight (LoS) links.

This thesis first introduces models for the obstacle shapes, the UAV flight path, the flight energy consumption, and the charging process. An optimization problem that minimizes the UAV flight path length is formulated and modeled as a Markov decision process. A deep reinforcement learning algorithm is used to solve it: path planning is completed by optimizing an action space consisting of the UAV speed, yaw angle, and pitch angle, while ensuring that the UAV's energy is never exhausted. For the dynamic environment, this thesis also discusses the modeling of dynamic obstacles, which are analyzed with two random motion models.

Considering the greater difficulty of solving the target problem in a dynamic environment, this thesis starts from the experience replay and experience storage mechanisms, studies and improves the traditional DQN algorithm, and proposes a double-priority experience replay DQN algorithm based on sample optimization; the sample optimization mechanism improves the convergence speed of the algorithm. In addition, a hybrid DQN local intelligent path planning algorithm is proposed to complete local UAV path planning given a known static optimal path.

Finally, simulations are carried out in both static and dynamic environments. The results show that in a complex urban environment, the UAV can complete the obstacle avoidance task well with the deep reinforcement learning algorithm, plan its own path to minimize the flight path length, and satisfy its energy constraint by harvesting energy from the wireless charging stations. The two proposed algorithms are also compared with the traditional DQN algorithm; the results show that the proposed algorithms converge faster and achieve shorter flight paths and lower energy consumption.
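The MDP described above can be sketched in code as follows. The discretization of the action space (speed, yaw angle, pitch angle), the state layout, and the simple energy and reward models are illustrative assumptions, not the thesis's exact formulation:

```python
import math

# Illustrative discretization of the action space described in the text:
# each action is a (speed, yaw change, pitch angle) triple. The specific
# values below are assumptions for illustration only.
SPEEDS = [5.0, 10.0]                        # m/s
YAWS = [-math.pi / 4, 0.0, math.pi / 4]     # rad, relative heading change
PITCHES = [-math.pi / 6, 0.0, math.pi / 6]  # rad, climb/descend angle
ACTIONS = [(v, y, p) for v in SPEEDS for y in YAWS for p in PITCHES]

def step(state, action, dt=1.0):
    """One MDP transition: move the UAV, spend energy, reward progress.

    state = (x, y, z, heading, energy); the goal is fixed at the origin
    here for simplicity.
    """
    x, y, z, heading, energy = state
    v, dyaw, pitch = action
    heading += dyaw
    # Kinematic update from speed, yaw, and pitch.
    x += v * dt * math.cos(pitch) * math.cos(heading)
    y += v * dt * math.cos(pitch) * math.sin(heading)
    z += v * dt * math.sin(pitch)
    # Assumed flight-energy model: consumption grows with speed.
    energy -= 0.1 * v * dt
    dist = math.sqrt(x * x + y * y + z * z)
    # Negative distance encourages minimizing the flight path; running
    # out of energy ends the episode with a large penalty, which is how
    # the energy constraint enters the learning problem.
    reward = -dist
    done = energy <= 0.0 or dist < 1.0
    if energy <= 0.0:
        reward -= 100.0
    return (x, y, z, heading, energy), reward, done
```

A DQN agent would select an index into `ACTIONS` from the Q-network and call `step` to generate transitions; obstacle collisions and LoS charging would add further reward terms not shown here.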
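The abstract does not spell out the "double priority" and sample optimization mechanisms, so the sketch below shows only the standard proportional prioritized experience replay that such DQN variants build on: transitions with larger TD error are sampled more often, which is what accelerates convergence. All names and parameter values are assumptions for illustration:

```python
import random

class PrioritizedReplayBuffer:
    """Simplified proportional prioritized experience replay."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha       # how strongly priority skews sampling
        self.data = []           # stored transitions
        self.priorities = []     # one priority per transition
        self.pos = 0             # next slot to overwrite when full

    def push(self, transition, td_error=1.0):
        # Priority is a power of the absolute TD error; the small
        # constant keeps zero-error transitions sampleable.
        priority = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:                    # ring buffer: overwrite the oldest
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = random.choices(range(len(self.data)),
                             weights=probs, k=batch_size)
        return idx, [self.data[i] for i in idx]

    def update_priorities(self, indices, td_errors):
        # After a training step, refresh priorities with new TD errors.
        for i, err in zip(indices, td_errors):
            self.priorities[i] = (abs(err) + 1e-6) ** self.alpha
```

In a DQN training loop, `push` stores each transition, `sample` draws a minibatch biased toward high-error experiences, and `update_priorities` is called after the Q-network update.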