Reinforcement learning has been widely applied in fields such as robotic control and has achieved remarkable performance. However, it is still plagued by problems such as large state-action spaces, low sample efficiency, sparse and delayed rewards, and poor generalization. In multi-task scenarios, these problems become more severe, and a new one arises: catastrophic forgetting. Hierarchical reinforcement learning addresses many of these problems by decomposing complex tasks into multiple simpler sub-tasks and abstracting a set of sub-policies. However, these sub-policies are learned under the guidance of externally defined, task-specific reward functions, making them purpose-driven and difficult to reuse in other tasks, especially in multi-task scenarios. In contrast, skill discovery algorithms mask the task-specific external reward and instead construct intrinsic reward functions from information-theoretic quantities such as mutual information and entropy, motivating agents to explore the whole state space and produce diverse, task-agnostic sub-policies (skills) that can serve different task requirements. This paper proposes a skill-discovery-based multi-task hierarchical reinforcement learning approach that combines the advantages of hierarchical reinforcement learning and skill discovery to improve agent performance in multi-task scenarios. The research consists of the following three parts:

(1) Designing a skill-based multi-task hierarchical reinforcement learning framework. Within this framework, agents automatically extract general skills containing transferable knowledge from the source environment, mitigating the forgetting problem in multi-task learning. Combined with hierarchical control, this knowledge can be reused in downstream tasks, reducing task complexity and improving sample efficiency. In addition, the skill library that stores the skills can be extended as the number of tasks grows, meeting the requirements of an increasing range of tasks.

(2) Proposing an independent skill discovery algorithm within the multi-task hierarchical reinforcement learning framework. First, the relationship between skill-policy features and states and actions is derived mathematically, and the policy of each primitive skill is represented using a feature extraction algorithm. Next, cluster analysis groups skills with similar functionality, and finally a policy distillation algorithm aggregates each group into a new independent skill that combines the strengths of its members. This removes redundancy from the skill library and improves the quality of the skill policies, leading to better performance in downstream tasks.

(3) Proposing an adaptive skill discovery algorithm within the multi-task hierarchical reinforcement learning framework. The algorithm encodes task information into the input and, drawing on ideas from meta-reinforcement learning, trains a gradient predictor shared by multiple trainers. Skill policies are then updated with the gradients output by the predictor so that they match task requirements, allowing unsupervised skills learned without reward-function guidance to adapt quickly and effectively to downstream tasks.

For each of the above methods, this paper verifies feasibility through experimental analysis and compares the proposed methods with related algorithms, demonstrating their superior performance.
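To make the skill-discovery objective described above concrete, the following is a minimal PyTorch sketch of a DIAYN-style intrinsic reward, one common instance of the mutual-information approach the abstract refers to, not necessarily the exact formulation used in this thesis. The network sizes, the uniform skill prior, and the names SkillDiscriminator and intrinsic_reward are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

STATE_DIM, N_SKILLS = 17, 8  # hypothetical sizes; set to match the environment

class SkillDiscriminator(nn.Module):
    """q_phi(z | s): infers which skill z produced a visited state s."""
    def __init__(self, state_dim=STATE_DIM, n_skills=N_SKILLS):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_skills),
        )

    def forward(self, state):
        return self.net(state)  # unnormalized logits over skills

def intrinsic_reward(disc, state, skill_idx, n_skills=N_SKILLS):
    """DIAYN-style reward log q(z|s) - log p(z), with a uniform prior p(z).

    The reward is high when the state is easy to attribute to the active
    skill, pushing skills toward distinguishable (diverse) behavior with
    no task-specific external reward involved.
    """
    with torch.no_grad():
        log_q = F.log_softmax(disc(state), dim=-1)[skill_idx]
    log_p = -torch.log(torch.tensor(float(n_skills)))  # log(1/K)
    return (log_q - log_p).item()
```

In a training loop, this reward would replace the environment reward when updating each skill policy, while the discriminator is trained by standard classification on (state, skill) pairs.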
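As an illustration of the hierarchical control mentioned in part (1), the sketch below shows a two-level rollout in which a high-level policy selects a skill from the library every k primitive steps. It assumes a gym-style environment with the classic four-value step() return and hypothetical deterministic skill networks; it is a sketch of the general pattern, not the thesis's own controller.

```python
import torch
import torch.nn as nn

class HighLevelPolicy(nn.Module):
    """Manager: chooses which library skill to execute next."""
    def __init__(self, state_dim, n_skills):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim, 128), nn.ReLU(),
            nn.Linear(128, n_skills),
        )

    def forward(self, state):
        return torch.distributions.Categorical(logits=self.net(state))

def hierarchical_rollout(env, manager, skill_library, k=10, horizon=200):
    """Two-level control: every k primitive steps the manager samples a
    skill index z, and skill_library[z] then drives the environment."""
    state, ret = env.reset(), 0.0
    for _ in range(horizon // k):
        s = torch.as_tensor(state, dtype=torch.float32)
        z = manager(s).sample().item()
        for _ in range(k):
            action = skill_library[z](s)
            state, reward, done, _ = env.step(action.detach().numpy())
            ret += reward
            s = torch.as_tensor(state, dtype=torch.float32)
            if done:
                return ret
    return ret
```

Because the manager reasons over skill indices rather than raw actions, its effective horizon shrinks by a factor of k, which is the source of the reduced task complexity the abstract describes.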
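The pipeline in part (2), feature extraction followed by clustering and distillation, might be sketched as follows. The feature representation (each skill's actions on a fixed probe set of states), the use of scikit-learn's KMeans, and MSE-based distillation onto the mean teacher action are all simplifying assumptions standing in for the thesis's actual feature-extraction and distillation algorithms.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.cluster import KMeans

# Assumed setup: `skills` is a list of deterministic policy networks mapping
# states to actions; `probe_states` is a fixed batch of states sampled from
# the source environment.

def skill_features(skills, probe_states):
    """Represent each skill by its actions on a shared probe set, a simple
    stand-in for the feature-extraction step described in the abstract."""
    feats = []
    with torch.no_grad():
        for pi in skills:
            feats.append(pi(probe_states).flatten().numpy())
    return np.stack(feats)

def cluster_skills(skills, probe_states, n_clusters):
    """Group functionally similar skills via k-means on their features."""
    labels = KMeans(n_clusters=n_clusters, n_init=10).fit_predict(
        skill_features(skills, probe_states))
    return [[s for s, l in zip(skills, labels) if l == c]
            for c in range(n_clusters)]

def distill_cluster(teachers, student, states, epochs=100, lr=1e-3):
    """Aggregate one cluster of similar skills into a single independent
    skill by regressing the student onto the mean teacher action."""
    opt = torch.optim.Adam(student.parameters(), lr=lr)
    with torch.no_grad():
        target = torch.stack([t(states) for t in teachers]).mean(dim=0)
    for _ in range(epochs):
        loss = nn.functional.mse_loss(student(states), target)
        opt.zero_grad(); loss.backward(); opt.step()
    return student
```

Replacing each cluster of teachers with its distilled student is what shrinks the skill library while preserving, and ideally combining, the behavior of its members.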
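Part (3)'s gradient predictor could take a form like the following sketch, where a shared meta-network maps a task embedding together with the flattened skill-policy parameters to a parameter update. The choice of inputs, the network shape, and the single-step update rule are assumptions made for illustration; the thesis's predictor may condition on different signals.

```python
import torch
import torch.nn as nn

class GradientPredictor(nn.Module):
    """Shared meta-network: given a task embedding and a flattened view of
    the current skill-policy parameters, predict a parameter update.
    param_dim must equal the policy's total parameter count."""
    def __init__(self, task_dim, param_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(task_dim + param_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, param_dim),
        )

    def forward(self, task_emb, flat_params):
        return self.net(torch.cat([task_emb, flat_params], dim=-1))

def adapt_skill(policy, predictor, task_emb, step_size=1e-2):
    """Apply one predicted-gradient update so an unsupervised skill can be
    steered toward a downstream task without an explicit reward signal."""
    flat = torch.cat([p.data.flatten() for p in policy.parameters()])
    delta = predictor(task_emb, flat).detach()
    offset = 0
    for p in policy.parameters():
        n = p.numel()
        p.data.add_(step_size * delta[offset:offset + n].view_as(p))
        offset += n
```

In the meta-learning phase, the predictor itself would be trained across multiple trainers/tasks so that its predicted updates improve downstream returns, which is what allows fast adaptation at deployment time.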