| In human-machine hybrid intelligence systems, AI-enabled machine intelligence and human intelligence are integrated and, in specific scenarios, can surpass the decision-making performance of either a human or a machine alone, which has made such systems a current research hotspot. However, unlike traditional human-machine systems and AI algorithms, the decision-making performance of a human-machine hybrid intelligence system is influenced not only by the performance of the AI algorithm in the training phase, such as its generalization and robustness, but also by the way human and machine decisions are combined in the execution phase, such as the allocation of control between human and machine. How to optimize the decision-making performance of human-machine hybrid intelligence systems as a whole is therefore an important research topic. This thesis addresses sequential decision making in human-machine hybrid intelligent decision-making systems driven by deep reinforcement learning algorithms. It introduces human intelligence on both the training side and the execution side of the algorithm to improve the robustness and safety of system decisions and, ultimately, the decision-making performance of the human-machine hybrid intelligence system. The work in this thesis comprises the following three main aspects: (1) For the sequential decision problem of a human-machine shared control system driven by a reinforcement learning algorithm, a human-in-the-loop reinforcement learning algorithm based on human policy constraints is proposed for the training phase to avoid dangerous machine behaviors while improving the sampling efficiency of the algorithm; an arbitration mechanism that includes human decision evaluation is proposed for the execution phase to discard erroneous human decisions and improve the overall performance of the system. Experimental results show that this method improves both the sampling efficiency of algorithm training and the success rate of the system in executing tasks. (2) For the sequential decision problem of a human-machine traded control system driven by a reinforcement learning algorithm in a multi-drone racing scenario, a group of reward functions that includes human feedback rewards is introduced in the training phase to guide the machine to learn the racing rules and to reduce the number of human interventions in the execution phase; a two-level human intervention mechanism is introduced in the execution phase to prevent rule-breaking or accident-prone behaviors and to reduce the operational burden of human intervention. Experimental results show that this method shortens the lap time of the drone, improves the safety margin of system decisions, and reduces the burden of human intervention. (3) For the human-machine hybrid decision-making methods above, this thesis builds a simulation-to-reality human-machine experimental platform based on multirotor UAVs, proposes an overall process and framework for deploying the algorithms in real physical scenarios, and validates the proposed reinforcement-learning-driven human-machine traded control method for multi-drone racing in real-world scenarios. |