Modeling And Simulation Of Human Group Behavior Based On Q-learning Behavior Tree

Posted on: 2020-08-03
Degree: Master
Type: Thesis
Country: China
Candidate: Z Q Ding
GTID: 2428330575464569
Subject: Control Science and Engineering
Abstract/Summary:
With the development of the social economy and changes in the international situation, the movement of people around the world has become more frequent, and the safety of crowded places has become an increasingly prominent concern. To provide early warning of abnormal behavior in crowded places and to formulate reasonable security measures and emergency response plans, it is necessary to study the behavioral mechanisms and characteristics of pedestrians. This research has important economic and social significance. Moreover, small groups of 2-5 pedestrians account for 70% of a crowd, so studying small groups allows crowd behavior to be described more accurately. Beyond behavior generation methods, crowd simulation also requires pedestrians to make reasonable decisions under different circumstances. Existing agent decision models widely use behavior trees, but behavior trees are complicated to design and debug, cannot be generated automatically, and are inefficient to develop. Against this background, this thesis studies the agent decision-making model and the method of generating group behavior.

This thesis introduces multi-step Q-learning, which has a self-learning mechanism, to improve the behavior tree. To address the shortcomings of multi-step Q-learning, a simulated annealing strategy is used to improve its action selection strategy and reduce the probability of selecting non-optimal actions, and a dynamic programming strategy is used to update the Q-value function in reverse order, which speeds up convergence. The improved multi-step Q-learning algorithm is then introduced into the behavior tree, and a behavior tree decision model based on improved multi-step Q-learning is proposed, which enables the agent to adjust the behavior tree automatically and generate an appropriate behavioral response.

Next, the effect of the value of n on the convergence speed of multi-step Q-learning is studied, and the optimal value of n is determined. A comparison of the proposed algorithm with ordinary Q-learning and the SAQ algorithm shows that the proposed algorithm converges fastest. Finally, behavior trees for police and for abnormal actors are designed, and comparative experiments are carried out in a scenario in which important people inspect a city. The results show that the automatically designed behavior tree is more reasonable than the manually constructed one.

For the generation of group behavior in crowds, this thesis establishes a mathematical model of group formation based on linear interpolation and proposes a dynamic adjustment model of group formation. The model uses ray detection to determine the formation of the group according to the size of the available space. A two-level steering system is used to realize the movement of pedestrians in the virtual environment. The first level is the group agent, which performs global path planning with an improved PRM path-finding algorithm in which the node distribution is optimized and the path is smoothed. The second level is the pedestrian agent, which uses the A* algorithm for local path planning. Finally, a comparative experiment is designed to verify the correctness and realism of the proposed method. The experimental results show that the proposed decision model and the human group behavior generation method effectively improve the quality of crowd simulation and provide a useful reference for simulation-based public safety research.
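
The abstract does not give the update rules, so the following Python is only a minimal sketch of how simulated-annealing action selection and a reverse-order multi-step Q-value update might look. The Metropolis acceptance test, the function names `sa_select_action` and `backward_n_step_update`, the dictionary Q-table, and the parameters `alpha`, `gamma` and `temperature` are all assumptions made for illustration; they are not taken from the thesis.

```python
import math
import random


def sa_select_action(q_row, temperature, rng=random):
    """Choose an action index from one Q-table row (hypothetical helper).

    A random candidate replaces the greedy action with probability
    exp((Q[candidate] - Q[greedy]) / temperature), a Metropolis-style test
    assumed here. As the temperature falls, non-greedy actions are accepted
    less often.
    """
    greedy = max(range(len(q_row)), key=lambda a: q_row[a])
    candidate = rng.randrange(len(q_row))
    diff = q_row[candidate] - q_row[greedy]  # never positive
    if temperature > 0 and rng.random() < math.exp(diff / temperature):
        return candidate
    return greedy


def backward_n_step_update(q, trajectory, bootstrap_state, alpha=0.1, gamma=0.9):
    """Update Q along an n-step trajectory in reverse order (assumed scheme).

    trajectory is a list of (state, action, reward) tuples in the order they
    were visited; bootstrap_state is the state reached after the last step.
    Walking backwards lets each step reuse the return already accumulated
    for the steps that followed it.
    """
    ret = max(q[bootstrap_state])  # value estimate beyond the last step
    for state, action, reward in reversed(trajectory):
        ret = reward + gamma * ret
        q[state][action] += alpha * (ret - q[state][action])
    return q
```

In a training loop under these assumptions, the temperature would typically be reduced after each episode (for example by multiplying it by a constant slightly below 1), so that exploration is gradually replaced by greedy selection.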
Keywords/Search Tags: multi-step Q-learning, simulated annealing, dynamic programming, behavior tree model, linear interpolation, group behavior, Unity3D