Research On Autonomous Driving Technology Based On Inverse Reinforcement Learning

Posted on:2020-01-12

Degree:Master

Type:Thesis

Country:China

Candidate:K Liu

GTID:2392330590474238

Subject:Control engineering

Abstract/Summary:

With the development of machine learning algorithms, autonomous driving technology continues to advance and will have a significant impact on future urban traffic. Decision-making and control algorithms form its core module. Existing approaches, including expert rule bases and behavioral cloning, generalize poorly and are not suited to complex scenes. Reinforcement learning algorithms can explore and can therefore optimize a policy that generalizes better. However, state-of-the-art reinforcement learning still suffers from two problems: exploration is costly, and the reward function is hard to determine. To address these problems, this dissertation presents a modified policy optimization algorithm, uses inverse reinforcement learning to learn the optimal reward function, and applies both to the autonomous driving decision task.

To address the high cost of exploration in reinforcement learning decision-making algorithms, this dissertation presents a deep deterministic policy gradient (DDPG) algorithm combined with an expert supervised loss. A combined sampling mechanism draws training data from both the expert demonstrations and the self-generated data. For the expert training data, the mean squared error between the expert policy and the current policy is used as the expert supervised loss, which is combined with the original policy gradient to optimize the policy. For the self-generated training data, the policy is updated by the original policy gradient alone. The expert supervised loss thus guides the policy toward the expert policy, while the agent continues to learn through its own exploration. The policy learning speed, the volatility of the training process, and the resulting optimal policy are compared and analyzed in The Open Racing Car Simulator (TORCS), and simulation examples of autonomous driving decisions show the effectiveness of the proposed algorithm.

To address the difficulty of constructing the reward function empirically, the maximum entropy inverse reinforcement learning algorithm is adopted to learn the optimal reward function. By analyzing the expert demonstration data, this dissertation extracts important state features and constructs the reward function as a linear combination of them. A probability model of the expert demonstration data is established based on the principle of maximum entropy, and the parameters of the reward function are iteratively optimized with the objective of maximizing the likelihood of the expert trajectories. With the learned reward function taken as the optimal reward function, the proposed policy optimization algorithm is then used to optimize the policy. The policy learning speed, the volatility of the training process, the optimal policy, and the generalization ability are analyzed in detail, and simulation examples demonstrate that the learned reward function is effective.
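
The two ideas described in the abstract can be illustrated with a minimal NumPy sketch. It is not the dissertation's implementation: the buffer layout, the function names (`combined_batch`, `actor_loss`, `maxent_irl_theta`), the weight `lam`, the fixed expert fraction, and the use of a small finite set of candidate trajectories in place of a full trajectory distribution are all assumptions made for illustration. The first two functions show combined sampling and an actor objective that adds a mean-squared-error term on expert samples only; the third shows the maximum entropy gradient for a reward that is linear in trajectory features, namely the expert feature mean minus the feature expectation under the current model.

```python
import numpy as np


def combined_batch(expert, agent, batch_size, expert_frac, rng):
    """Mix expert demonstrations and self-generated data in one batch.

    `expert` and `agent` are dicts with "state" and "action" arrays
    (a hypothetical buffer layout chosen for this sketch).
    """
    n_exp = int(round(batch_size * expert_frac))
    ei = rng.integers(0, len(expert["state"]), size=n_exp)
    ai = rng.integers(0, len(agent["state"]), size=batch_size - n_exp)
    states = np.concatenate([expert["state"][ei], agent["state"][ai]])
    actions = np.concatenate([expert["action"][ei], agent["action"][ai]])
    # 1.0 marks expert samples, 0.0 marks self-generated samples.
    is_expert = np.concatenate([np.ones(n_exp), np.zeros(batch_size - n_exp)])
    return states, actions, is_expert


def actor_loss(q_values, policy_actions, batch_actions, is_expert, lam=1.0):
    """Policy-gradient term (-mean Q) plus an MSE term on expert samples only.

    `lam` is an assumed weighting between the two terms.
    """
    pg_term = -np.mean(q_values)
    sq_err = np.sum((policy_actions - batch_actions) ** 2, axis=1)
    n_exp = max(is_expert.sum(), 1.0)  # avoid dividing by zero
    sup_term = np.sum(is_expert * sq_err) / n_exp
    return pg_term + lam * sup_term


def maxent_irl_theta(expert_features, candidate_features, lr=0.1, steps=500):
    """Fit theta for a linear reward r(tau) = theta . f(tau).

    Assumes P(tau) is proportional to exp(theta . f(tau)) over a finite
    candidate set, and ascends the expert log-likelihood.
    """
    theta = np.zeros(candidate_features.shape[1])
    mu_expert = expert_features.mean(axis=0)
    for _ in range(steps):
        logits = candidate_features @ theta
        p = np.exp(logits - logits.max())  # numerically stable softmax
        p /= p.sum()
        # Gradient: expert feature mean minus model feature expectation.
        theta += lr * (mu_expert - p @ candidate_features)
    return theta
```

In a DDPG-style training loop, `q_values` would come from the critic evaluated at the actor's actions, and the loss would be differentiated with respect to the actor parameters by an autodiff framework; the sketch only evaluates the objective to show how the two terms combine.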

Keywords/Search Tags:

inverse reinforcement learning, autonomous driving, expert demonstration data, expert supervised loss, reward function

Related items

1	Autonomous Driving Systems Design And Implementation Based On Deep Reinforcement Learning
2	Research On Vehicle Autonomous Following Decision-Making Via Deep Reinforcement Learning
3	Research On Autonomous Driving Decision Control Based On Deep Reinforcement Learning
4	Deep Reinforcement Learning Based Autonomous Driving Decision-Making Methods
5	Human-like Adaptive Cruise Control Algorithm Design Based On Deep Reinforcement Learning
6	Intelligent Control Of Autonomous Driving Based On Deep Reinforcement Learning
7	Research On Deep Reinforcement Learning Method For Autonomous Cooperative Reconnaissance Of UAV Swarm
8	Research On Intrinsic Reward Optimization Method Of Reinforcement Learning
9	Research On Imitation Learning And Its Applications In Autonomous Driving
10	Behavior Decision-making Of Intelligent Vehicle Based On Expert Driver Demonstration