Deep Reinforcement Learning Based End-to-end Visual Servo Control For Multi-rotor Unmanned Aerial Vehicles

Posted on: 2019-01-05
Degree: Master
Type: Thesis
Country: China
Candidate: Y B Xu
GTID: 2392330611993348
Subject: Control Science and Engineering
Abstract/Summary:
UAV swarms have great application prospects in both military and civilian fields, which attracts a growing number of researchers to their key technologies. Because a swarm contains a large number of UAVs, managing the swarm, including autonomous charging and autonomous recovery, is an important research topic. Autonomous landing is a key technology in all of these scenarios. This thesis therefore studies the application of deep reinforcement learning to UAV auto-landing. Within the deep reinforcement learning framework, images are used as the model input to achieve an end-to-end auto-landing servo control method, which helps make UAV auto-landing more intelligent. The main research contents of this thesis are as follows:

(1) A value-based deep reinforcement learning algorithm, Q-learning, is used to solve the UAV auto-landing problem. First, the landing problem is formulated as a Markov decision process. The UAV's downward-looking image is treated as its state, and position information from the Gazebo simulation environment is used to construct the reward function. Through interaction between the UAV and Gazebo, the deep reinforcement learning network is trained to realize end-to-end control of the UAV. Both the original Q-learning algorithm and the 3DQN algorithm are used in training. In addition, two ways of accelerating model convergence are tried, an external controller and database pre-training, and some acceleration is obtained.

(2) The DDPG algorithm, which is based on the actor-critic (AC) framework, is applied to the multi-rotor UAV autonomous landing problem. To change the UAV's control quantity from discrete to continuous values, and to improve the convergence speed of the algorithm, this thesis applies the policy-based deep reinforcement learning algorithm DDPG to UAV autonomous landing in Gazebo. Monitoring the indicators during training shows that this algorithm converges faster than the 3DQN algorithm. However, after staying converged for a period of time it diverges, and evaluation indicators such as the reward value and the loss value of the evaluation network fluctuate drastically. The model may therefore have over-fitted during the earlier training, so the structure of the deep network or the parameter initialization method also needs improvement. When the model is tested, the flight trajectory is indeed smoother than under discrete motion control, and the UAV can land near the center point of the landmark.

(3) The end-to-end deep reinforcement learning control network model trained from a database is tested on a physical UAV platform. The deep neural network model is based on the deep reinforcement learning algorithm 3DQN and is pre-trained with the database. With the model deployed on the physical UAV platform, the landing test succeeds, which is a promising attempt at applying deep reinforcement learning to a physical UAV platform in a real environment. The test results show that the database pre-trained model has a certain adaptability to real flight scenarios and can land the UAV in the vicinity of the ground marker center, which reflects the effectiveness of end-to-end control based on deep reinforcement learning.
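The abstract states that position information from the simulator is used to construct the reward and that a value-based Q-learning algorithm is trained on it, but it gives no formulas. The following is a minimal Python sketch of what such a setup might look like, not the thesis's actual implementation. The function names, the reward shaping, the `success_radius` threshold, and the discount factor `gamma` are all hypothetical choices made for illustration.

```python
import math


def landing_reward(uav_xyz, marker_xyz, landed, success_radius=0.3):
    """Hypothetical position-based reward for a landing episode.

    uav_xyz and marker_xyz are (x, y, z) tuples in metres, as a simulator
    might report them. The shaping below is an assumption, not the reward
    used in the thesis.
    """
    dx = uav_xyz[0] - marker_xyz[0]
    dy = uav_xyz[1] - marker_xyz[1]
    horizontal_error = math.hypot(dx, dy)
    if landed:
        # Terminal reward: positive only if touchdown is close to the marker.
        return 1.0 if horizontal_error <= success_radius else -1.0
    # Small per-step penalty that grows with horizontal distance to the marker.
    return -0.01 * horizontal_error


def q_learning_target(reward, next_q_values, done, gamma=0.99):
    """Standard one-step Q-learning target: r + gamma * max_a' Q(s', a')."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


if __name__ == "__main__":
    r = landing_reward((3.0, 4.0, 2.0), (0.0, 0.0, 0.0), landed=False)
    print(r, q_learning_target(r, [0.5, 2.0, -1.0], done=False))
```

In a deep Q-learning setting, `next_q_values` would come from a network evaluated on the next downward-looking image, and the target would be regressed against the network's estimate for the action actually taken.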
Keywords/Search Tags: Deep learning, Reinforcement learning, End-to-end control, Autonomous landing, Unmanned Aerial Vehicle (UAV)