| Reinforcement learning has been shown to work well in scenarios with well-designed reward functions and readily available interaction with the environment. However, in some real-world applications, designing a reward function is difficult, and the environment may be tied to specific hardware devices. Frequent interaction between the agent and the environment can lead to high costs and even catastrophic failures, especially in high-risk scenarios. To overcome these challenges, human-in-the-loop reinforcement learning algorithms enable humans to guide the agent’s strategies during training by providing additional reward functions and task knowledge. This reduces the frequency of interaction between the agent and the environment to ensure safety. In this thesis, we study human-in-the-loop reinforcement learning algorithms and their applications. The main contents are as follows: (1) Based on the model-free shared autonomy system framework, this thesis proposes a shared autonomy system based on intervention reward. The system uses the reward shaping method from human-in-the-loop reinforcement learning to design the intervention reward. The agent learns to process human inputs through intervention rewards, enabling it to improve task rewards while maintaining human control. This thesis also proposes two methods for optimizing the intervention reward function, based on time and action relevance. The performance of the system is evaluated experimentally in the LunarLander scenario, and the results show that the proposed shared autonomy system outperforms existing systems in collaborative performance. (2) To enable non-experts in various fields to apply reinforcement learning algorithms, this thesis designs and implements HITL-AI, a platform for managing human-in-the-loop reinforcement learning agents. The main modules are designed according to user requirements, and the functions of each module are delineated. The platform’s front end and back end are built with the Vue and SpringBoot frameworks, respectively. The front end allows users to participate in the creation, training, and deployment of reinforcement learning agents. The back end connects the modules in the form of services, and a MySQL database serves as the persistence layer for storing models and data. The platform’s functional modules are presented through the interaction between the front end and the back end. (3) To verify the usability and adaptability of the HITL-AI platform, its functions are tested in the classic reinforcement learning scenario GridWorld and in two real-world application scenarios: a network scenario and an unmanned aerial competition scenario. Across these scenarios, the usability and scalability of the platform are demonstrated from the perspectives of application process design and application effect display, respectively. This provides some ideas for future research based on the platform. |