
Intelligent Control Of Autonomous Driving Based On Deep Reinforcement Learning

Posted on: 2019-06-12  Degree: Master  Type: Thesis
Country: China  Candidate: S X Zuo
GTID: 2382330566998505  Subject: Control Science and Engineering
Abstract/Summary:
With the development of artificial intelligence (AI), more and more intelligent devices are changing our lives. Autonomous cars are very promising for future transportation, and decision making and control are central problems for them. Imitation learning and reinforcement learning (RL) can both teach an agent to make decisions and generate appropriate policies. In this paper, we select two typical algorithms, Dataset Aggregation (DAgger) and Deep Deterministic Policy Gradient (DDPG), and analyze their strengths and weaknesses. We find that although DAgger finds policies rapidly, the quality of those policies is severely limited by the demonstrator's policy. DDPG does not need a demonstrator, but its training depends heavily on how the reward function is defined. Hence, we use RL methods to address these problems and propose a new RL algorithm that improves training quality by learning from demonstrations.

We introduce supervision from demonstrations into the original DDPG and refer to the resulting method as DDPG with Demonstration (DDPGwD). The algorithm is based on the actor-critic framework, and we design a new cost function for training the critic network. The cost function is the weighted sum of the TD loss and the mean squared error between the Q values generated by the demonstrator's action and by the current policy's action. A margin value is used to make the supervision more effective. We describe in detail the parameter update of the critic network under this cost function.

We also propose an integrated experience replay method to reduce fluctuation when training the original DDPG. The idea is to always include a portion of transitions with good behavior when sampling training data. In the early episodes, which usually do not contain enough good transitions, we instead include the best transition of each episode. Combining this with the supervised cost function above, we describe the training process of the DDPGwD algorithm. We use TORCS, a commonly used simulator, to verify the effectiveness of the proposed algorithm, and the simulation results validate the practicability of DDPGwD for autonomous driving.
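
The two mechanisms named in the abstract, the supervised critic cost and the integrated experience replay, can be illustrated with a minimal Python sketch. The abstract does not give the exact formulas, so everything below is an assumption made for illustration: the function names, the weight `lam`, the hinge form in which the margin enters, the reward threshold that defines a "good" transition, and the `good_fraction` of each batch are all hypothetical and are not taken from the thesis.

```python
import random


def critic_loss(q_pred, td_target, q_demo, q_policy, lam=1.0, margin=0.1):
    """TD loss plus a weighted, margin-based supervision term (assumed form)."""
    n = len(q_pred)
    # Mean squared TD error between the critic's estimate and its target.
    td_loss = sum((y - q) ** 2 for q, y in zip(q_pred, td_target)) / n
    # Supervision: penalise states where Q(s, a_demo) does not exceed
    # Q(s, pi(s)) by at least `margin`. The hinge is an assumption.
    sup_loss = sum(max(0.0, qp + margin - qd) ** 2
                   for qd, qp in zip(q_demo, q_policy)) / n
    return td_loss + lam * sup_loss


def record_episode(episode, buffer, good_buffer, reward_threshold):
    """Store an episode and keep its good transitions in a separate pool."""
    if not episode:
        return
    buffer.extend(episode)
    # "Good" is assumed to mean reward at or above a fixed threshold.
    good = [t for t in episode if t["reward"] >= reward_threshold]
    if not good:
        # Early episodes: fall back to the single best transition.
        good = [max(episode, key=lambda t: t["reward"])]
    good_buffer.extend(good)


def sample_integrated_batch(buffer, good_buffer, batch_size,
                            good_fraction=0.25, rng=random):
    """Sample a batch that always contains a share of good transitions."""
    n_good = min(len(good_buffer), int(batch_size * good_fraction))
    batch = rng.sample(good_buffer, n_good)
    # Fill the rest of the batch uniformly from the full replay buffer.
    n_rest = min(len(buffer), batch_size - n_good)
    batch += rng.sample(buffer, n_rest)
    return batch
```

In this sketch a transition is a dictionary with at least a `"reward"` key. A training loop would call `record_episode` after each episode, draw batches with `sample_integrated_batch`, and pass the resulting Q values to `critic_loss`. The fallback to the best transition is triggered here whenever an episode has no transition above the threshold, which is one simple reading of the "early episodes" rule.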
Keywords/Search Tags: intelligent decision making for autonomous driving, DDPG, DDPG with demonstration, imitation learning, reinforcement learning