
Domain Knowledge And Physical-Enhanced Reinforcement Learning For Intelligent Vehicles Decision-Making And Control

Posted on: 2022-12-28
Degree: Doctor
Type: Dissertation
Country: China
Candidate: Y X Zhang
Full Text: PDF
GTID: 1482306758477374
Subject: Vehicle Engineering
Abstract/Summary:
Realizing fully autonomous driving is an essential strategic task and the ultimate goal of intelligent-vehicle technology innovation. Currently, intelligent-vehicle research in the automotive industry mainly adopts rule-based or model-based methods and develops algorithms through experimental calibration and scenario verification. This approach suffers from long calibration and verification times, difficulty in meeting the development-cycle requirements of intelligent vehicles, high R&D costs, and difficulty in covering all actual working conditions. L4-L5 intelligent vehicles therefore need self-learning and self-adaptive abilities when facing complex and variable driving scenarios that are difficult to verify in the development stage. Intelligent decision and control methods that incorporate learning technologies are the core technologies for achieving this adaptive performance. The main current challenges are rational decision-making and safe control of autonomous vehicles in complex and dynamic scenarios. Because decision-making is abstract and therefore hard to model, and because motion control is strongly nonlinear and safety-critical, it is difficult for the control system to make adaptive decisions in complex scenarios and to ensure safe, adaptive control performance under changing working conditions. As a typical interactive machine learning technique, reinforcement learning (RL) can learn system control directly from feedback signals from the environment, and is thus considered an effective way to give decision-making and control an intelligent self-learning capability that enhances system performance. However, existing reinforcement learning algorithms do not yet offer performance advantages in learning efficiency, learning stability, or safety during the learning process. In this context, this dissertation aims to improve learning efficiency and ensure learning safety, and studies how to use the domain and physical knowledge of the system to enhance reinforcement learning algorithms for adaptive decision-making and control of intelligent vehicles in dynamic scenarios.

First, a policy-promotion-oriented reinforcement learning algorithm that combines direction evaluation with a model-free search guide is proposed to meet the adaptive learning needs of vehicle longitudinal driving strategies in dynamic scenarios. An online-learning environment whose parameters match those of the real-vehicle platform is established in CarSim. Based on an analysis of the policy gradient algorithm and of the problem using domain knowledge, the direction of action exploration is evaluated and guided so that the policy is promoted at every update step, which improves the efficiency and stability of the algorithm. Replacing current rule-based or model-based methods with an efficient online-learning system removes the need to calibrate control parameters or build precise models for all potential working conditions in longitudinal vehicle control, and allows the system control to be learned adaptively as scenarios change. Simulation and real-vehicle experiments verify that the method generalizes to untrained scenarios, learns adaptively in dynamic scenarios, and runs online in real time on real-vehicle platforms.

Second, a Barrier Lyapunov Function-based safe reinforcement learning algorithm is proposed to ensure that state-variable constraints are satisfied throughout lateral motion learning control under the system uncertainty caused by changing scenario conditions. A hierarchical system learning architecture is established using the optimized backstepping design. The overall system control is designed with the Barrier Lyapunov Function to account for state-variable constraints, and the virtual control in each subsystem is adaptively optimized with the derived updating equations. Guaranteeing the vehicle position constraint during learning limits the range of the vehicle position state variables despite uncertainty in the model parameters. The method thus gives vehicle lateral motion control consistent safe control performance under changing scenario conditions, together with adaptive learning performance as model parameters change with those conditions. The ultimate convergence of the state variables under the safety constraints and of the optimization-based learning part is demonstrated by the Lyapunov method.

Then, to further consider full state-variable constraints for vehicle lateral and longitudinal motion learning control in dynamic scenarios, an adaptive safe reinforcement learning method is proposed to address the difficulty that safety performance cannot always be guaranteed during the learning period. Building on the Barrier Lyapunov Function-based safe reinforcement learning algorithm, the method introduces an asymmetric barrier Lyapunov function to handle state-variable constraints in asymmetric form. Based on Lyapunov stability analysis, the conflict between safety performance and optimization performance is formulated as an inequality constraint on the learning update. The constrained adaptive algorithm is designed to ensure that all state variables converge consistently and always remain within the safety constraint region. Therefore, when changing scenario conditions cause model uncertainty, lateral and longitudinal motion control can keep the vehicle position and velocity state variables within the constraint region. This better solves the problem of adapting to dynamic scenario conditions and prevents the system state variables from entering the unsafe region when model uncertainty would otherwise lead to motion instability and control failure.

Finally, for the adaptive and interactive decision-making problem in complex scenarios, a parameter-based decision-making method is constructed that effectively and adaptively defines the decision-making problem in different scenarios. The mixed action space contains three decision parameters: lateral offset, acceleration, and action duration. An optimization-embedded reinforcement learning method is then proposed, which embeds a model-based optimization method in reinforcement learning. Characteristic data of the lower-level controller are sampled to train a neural network model, and a direct search algorithm uses this neural network to explore continuous behaviors in the action space directly. This reduces the complexity of the problem at the decision-making level and thus improves the efficiency of the learning algorithm on the mixed decision space. Interactive decision-making in changing scenarios is realized through the online solution capability of the optimization algorithm.

This dissertation studies decision-making and control methods for intelligent vehicles and designs self-learning control algorithms that use domain and physical knowledge to enhance performance. The strategies designed for action exploration and for the policy iteration equation improve algorithm efficiency and ensure safety during the learning period, which addresses the adaptation problem of intelligent vehicles in dynamic scenarios and changing working conditions. The effectiveness of the proposed methods is verified through simulation and real-vehicle experiments. This research is significant for enabling future L4-L5 intelligent vehicles to adaptively iterate driving strategies in user-side software and for meeting the requirements of the development cycle and application scenarios of intelligent vehicle control algorithms.
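
As background on the barrier Lyapunov function mentioned in the abstract, a commonly used log-type form is sketched here. The abstract does not state which form the dissertation uses, so everything in this block is an assumption for illustration: z denotes a constrained error variable, k_b a symmetric bound, and k_a, k_b the lower and upper bounds in the asymmetric case.

```latex
% Symmetric log-type barrier Lyapunov function (illustrative, assumed form),
% defined on the open set |z| < k_b:
V(z) = \frac{1}{2}\ln\frac{k_b^{2}}{k_b^{2}-z^{2}}, \qquad
\dot{V}(z) = \frac{z\,\dot{z}}{k_b^{2}-z^{2}} .

% Asymmetric variant on -k_a < z < k_b (illustrative, assumed form):
V(z) = \frac{q(z)}{2}\ln\frac{k_b^{2}}{k_b^{2}-z^{2}}
     + \frac{1-q(z)}{2}\ln\frac{k_a^{2}}{k_a^{2}-z^{2}}, \qquad
q(z) = \begin{cases} 1, & z > 0,\\ 0, & z \le 0. \end{cases}
```

In both forms V grows without bound as z approaches a constraint boundary, so any design that keeps V bounded along the closed-loop trajectories also keeps z strictly inside the constraint region.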
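
The three-parameter decision space and the direct search described in the abstract can be illustrated with a minimal Python sketch. It is not the dissertation's implementation: the names `DecisionAction`, `BOUNDS`, `surrogate_cost`, and `compass_search` are hypothetical, the bounds and units are invented for illustration, a simple quadratic function stands in for the trained neural network model, and compass search is used as one example of a derivative-free direct search.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class DecisionAction:
    """One decision: the three parameters named in the abstract."""
    lateral_offset: float  # assumed unit: m
    acceleration: float    # assumed unit: m/s^2
    duration: float        # assumed unit: s


# Hypothetical box bounds for (lateral_offset, acceleration, duration).
BOUNDS = ((-3.5, 3.5), (-3.0, 2.0), (0.5, 5.0))


def clip(value: float, low: float, high: float) -> float:
    """Project a scalar onto the interval [low, high]."""
    return max(low, min(high, value))


def surrogate_cost(action: DecisionAction) -> float:
    """Stand-in for the learned model (assumption): a quadratic bowl."""
    return ((action.lateral_offset - 1.0) ** 2
            + (action.acceleration - 0.5) ** 2
            + (action.duration - 2.0) ** 2)


def compass_search(
    cost: Callable[[DecisionAction], float],
    start: DecisionAction,
    step: float = 1.0,
    tol: float = 1e-3,
    max_iter: int = 1000,
) -> Tuple[DecisionAction, float]:
    """Derivative-free compass search inside BOUNDS.

    Tries +/- step along each parameter, keeps any improvement,
    and halves the step when no neighbor is better.
    """
    x = [clip(v, lo, hi) for v, (lo, hi) in zip(
        (start.lateral_offset, start.acceleration, start.duration), BOUNDS)]
    best = cost(DecisionAction(*x))
    for _ in range(max_iter):
        if step < tol:
            break
        improved = False
        for i, (lo, hi) in enumerate(BOUNDS):
            for delta in (step, -step):
                candidate = list(x)
                candidate[i] = clip(x[i] + delta, lo, hi)
                value = cost(DecisionAction(*candidate))
                if value < best:
                    x, best, improved = candidate, value, True
        if not improved:
            step *= 0.5
    return DecisionAction(*x), best


if __name__ == "__main__":
    action, value = compass_search(surrogate_cost, DecisionAction(0.0, 0.0, 1.0))
    print(action, value)
```

In a setup like the one the abstract describes, `surrogate_cost` would be replaced by a model trained on sampled lower-level controller data; the search itself only needs function evaluations, which is why no gradient of the model is required here.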
Keywords/Search Tags: Vehicle Control and Intelligence, Reinforcement Learning, Decision-Making and Trajectory Planning, Optimized Backstepping