| The behavior decision-making module is a key component for achieving safe, autonomous control of autonomous vehicles. Classical end-to-end autonomous driving decision-making methods have limitations, such as difficulty transferring to physical vehicles and dependence on human driving data. In recent years, reinforcement learning has demonstrated stronger autonomous exploration and decision-making capabilities in autonomous driving systems. In view of this, and motivated by the complex interactions and high accident rate at unsignalized roundabouts, this thesis proposes a decision-making optimization model for autonomous driving at roundabouts that integrates curiosity distillation, in order to achieve traffic control of autonomous vehicles at roundabouts. The main research content of the thesis is as follows: (1) To address the complex vehicle interaction behaviors at roundabouts, such as merging and exiting, this thesis proposes an autonomous driving decision model for roundabout traffic. First, the roundabout scene is modeled as a Markov decision process. Second, the basic reinforcement learning model is improved in two respects. On the one hand, because the task focus differs when an autonomous vehicle enters and exits the roundabout, the state space is divided into two parts, an environment representation and a task representation, which serve as inputs to the Actor network so that the output decisions maintain traffic efficiency. On the other hand, to correct the decision bias caused by Q-value overestimation under the greedy policy of the basic reinforcement learning model, a dual Critic network is used. Third, a prioritized experience replay model based on the Huber loss function is introduced to improve the utilization of training data while remaining robust to outliers. Finally, the model was trained and tested using the dual
lane human roundabout driving dataset ACFR Five Roundabouts. The experimental results show that the model can make safe and reasonable decisions in a two-lane roundabout scenario. (2) To address the one-dimensional and subjective nature of the manually designed external reward function in the Markov model, this thesis introduces a curiosity distillation module into the autonomous driving decision model for roundabout traffic. By integrating the intrinsic curiosity model and the random network distillation model, it proposes a decision-making optimization model for autonomous driving at roundabouts that integrates curiosity distillation, providing an embedded intrinsic reward mechanism for autonomous vehicles. Together with the manually designed reward function, this forms a complete reward mechanism for the model, which reduces the influence of subjective reward settings on the model's decision-making behavior, makes the rewards more reasonable, and further enhances the model's ability to explore the environment. (3) To verify the generalization ability of the model, this thesis uses an autonomous driving simulation platform to build a simulation test system for autonomous driving decisions in roundabout scenes. A simulation model of the single-lane roundabout at the "Ministry of Transport Identified Automated Driving Closed Site Test Base" of Chang’an University is set up, and the three-lane roundabout scenario built into the simulation platform is selected to analyze the model's adaptability. The experimental results show that the proposed model can achieve safe decision-making behavior in different roundabout scenarios. |
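
The abstract names three mechanisms without giving formulas: a two-critic target that counters Q-value overestimation, Huber-loss-based prioritized experience replay, and a reward that combines the manually designed term with curiosity-based intrinsic terms. The following is a minimal Python sketch of how such pieces are commonly written, not the thesis's actual implementation. Everything specific here is an assumption for illustration: taking the minimum of the two critic estimates, the function names, the priority exponent `alpha`, the constant `eps`, the Huber threshold `kappa`, and the weight `beta` applied to the sum of the two intrinsic bonuses.

```python
import random


def huber(td_error, kappa=1.0):
    """Huber loss: quadratic for small errors, linear beyond kappa (assumed threshold)."""
    a = abs(td_error)
    return 0.5 * a * a if a <= kappa else kappa * (a - 0.5 * kappa)


def twin_critic_target(reward, q1_next, q2_next, gamma=0.99, done=False):
    """TD target built from the smaller of two critic estimates.

    Assumption: the two critics are combined with a minimum, a common way
    to curb overestimation; the thesis does not state the exact rule.
    """
    if done:
        return reward
    return reward + gamma * min(q1_next, q2_next)


def replay_priorities(td_errors, alpha=0.6, eps=1e-3):
    """Priority per transition from its Huber-transformed TD error.

    The Huber transform grows only linearly for large errors, so a single
    outlier transition cannot dominate sampling. alpha and eps are
    hypothetical hyperparameters.
    """
    return [(huber(e) + eps) ** alpha for e in td_errors]


def sample_indices(priorities, k, rng=random):
    """Draw k transition indices with probability proportional to priority."""
    return rng.choices(range(len(priorities)), weights=priorities, k=k)


def total_reward(r_ext, r_curiosity, r_distill, beta=0.1):
    """Manually designed reward plus a weighted intrinsic bonus.

    Assumption: the two intrinsic terms are summed and scaled by a
    hypothetical weight beta.
    """
    return r_ext + beta * (r_curiosity + r_distill)
```

In this sketch, a transition with a TD error of 5.0 receives a higher priority than one with an error of 0.1, but the gap is smaller than it would be under a squared-error priority, which is the robustness property the abstract attributes to the Huber loss.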