The Lateral Control Method Of Autonomous Vehicles With Deep Reinforcement Learning

Posted on: 2023-06-29
Degree: Master
Type: Thesis
Country: China
Candidate: C Yin
Full Text: PDF
GTID: 2532307097976729
Subject: Mechanical engineering
Abstract/Summary:
Vehicle control is a crucial part of autonomous driving technology. Linear model predictive control is one of the most effective control methods currently available because it can handle multi-objective optimization and problems with variable constraints. However, the approximations introduced by model linearization and by the quadratic-programming formulation cause a loss of control accuracy. Since the vehicle is inherently a nonlinear system, it is difficult to describe the vehicle dynamics accurately with a simple model, which limits further accuracy improvements in model-based control methods. Model-free deep reinforcement learning, which learns the system characteristics and the optimal control strategy by continuously interacting with the environment, is a promising data-driven control method, but it suffers from problems such as low data utilization and numerous invalid explorations in the early training stage. This thesis designs data-driven lateral controllers that combine traditional optimization methods with deep reinforcement learning.

Firstly, a multi-previewed points controller based on nonlinear model predictive control is proposed. A nonlinear kinematics model that accounts for the actuator delay is used as the predictive model to reduce the scale of the nonlinear optimization. The lateral control problem is formulated as a nonlinear programming problem. The vehicle states calculated by the Stanley method over the prediction horizon serve as the initial values of the iteration, which stabilizes the computing time and avoids local optima. In addition, a dynamic compensation method is designed to improve the tracking accuracy under large lateral acceleration. Simulations are conducted to measure the performance of the multi-previewed points controller under various scenarios. The simulation results show that the multi-previewed points controller achieves higher accuracy than the linear model predictive controller.

Secondly, this thesis proposes a deep reinforcement learning framework based on the twin delayed deep deterministic policy gradient, combined with the multi-previewed points controller. A simulation training environment is designed to generate low-cost training data efficiently. To address the low sampling efficiency in the early training stage, an exploration strategy combining an expert strategy and a random strategy is proposed to increase the number of high-reward samples. The reward function combines qualitative goal achievement with quantitative goal maximization to guide the agent to learn faster and to optimize performance. Meanwhile, the action space is compressed to meet the actuator constraints and reduce action jitter. To enhance the generalization ability, the state observation is constructed in feedback form. The training scenario consists of tracking various sine paths at different speeds. A suitable network structure and hyperparameters are selected to ensure stable training convergence. The reinforcement learning controller is verified in the simulation training environment. The simulation results show that the exploration strategy combined with the expert strategy effectively improves the convergence rate. The reinforcement learning controller generalizes well and performs better than the multi-previewed points controller under large lateral acceleration.

Finally, the multi-previewed points controller and the reinforcement learning controller are tested in a real environment. The experimental results show that the multi-previewed points controller has high tracking accuracy under small lateral acceleration: the tracking error is less than 0.1 m and the computing time is less than 23 ms. The reinforcement learning controller exhibits good generalization ability: the tracking error is less than 0.15 m and the computing time is less than 1 ms. Compared with the multi-previewed points controller, the reinforcement learning controller exhibits similar performance but has lower computational cost and higher efficiency.
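
The mixed exploration strategy and the compressed action space described above can be illustrated with a minimal sketch. It is not the thesis's implementation. The abstract does not state how the expert and random strategies are mixed, so the fixed mixing probability `expert_prob` is an assumption, as are the function names, the gain, and the steering limit `max_steer`. The textbook Stanley steering law stands in for the expert here, and its sign convention is also assumed.

```python
import math
import random


def stanley_steering(heading_error, cross_track_error, speed, gain=1.0, eps=1e-3):
    """Textbook Stanley law: heading error plus the arctangent of the
    gain-scaled cross-track error divided by speed.
    `gain` and `eps` are illustrative values, not taken from the thesis."""
    return heading_error + math.atan2(gain * cross_track_error, speed + eps)


def explore_action(expert_action, max_steer, expert_prob, rng):
    """Hypothetical mixed exploration: take the expert action with
    probability `expert_prob`, otherwise sample a uniform random action.
    The result is clipped to the assumed actuator range [-max_steer, max_steer]."""
    if rng.random() < expert_prob:
        action = expert_action
    else:
        action = rng.uniform(-max_steer, max_steer)
    return max(-max_steer, min(max_steer, action))


if __name__ == "__main__":
    rng = random.Random(0)
    # Assumed example state: 0.05 rad heading error, 0.3 m offset, 8 m/s.
    expert = stanley_steering(0.05, 0.3, 8.0)
    samples = [explore_action(expert, 0.5, 0.7, rng) for _ in range(5)]
    print(samples)
```

In a training loop, actions drawn this way would be stored in the replay buffer during the early episodes, so that the buffer contains more high-reward samples than purely random exploration would provide.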
Keywords/Search Tags: Vehicle lateral control, path tracking, nonlinear model predictive control, deep reinforcement learning, twin delayed deep deterministic policy gradient