Operational Tasks Learning Based On Multi-info And State Similarities

Posted on:2021-05-11

Degree:Master

Type:Thesis

Country:China

Candidate:X L Liu

Full Text:PDF

GTID:2428330605468059

Subject:Control Science and Engineering

Abstract/Summary:

In the home environment, robots with multiple manipulation skills can complete more complex housekeeping tasks and provide users with a better service experience. Existing skill learning methods require large amounts of training data and place high demands on hardware. In particular, skills that have already been learned cannot be used to accelerate the training of new skills, so training takes a long time and is difficult to converge. For robots to learn different skills quickly in an environment, two key issues must be addressed: how to represent complex environmental states efficiently and concisely, and how to share knowledge between different operational tasks. To address these problems, this thesis proposes a fast skill learning algorithm for different operational tasks. High-level state features extracted by the Multi-Info network serve as the input to the skill policy network, similar state data are shared to achieve knowledge sharing, and an improved policy gradient reinforcement learning algorithm is used to learn the skills of operational tasks quickly. The specific research contents and innovations are as follows.

Firstly, the network structures of two state representation learning models are designed. For each operational task, a state data set is first collected in the environment using a randomly initialized skill policy. A state representation learning model based on the Auto-Encoder method is then used to extract common, static features, after which a second state representation learning model that satisfies five types of prior knowledge is trained to extract high-level dynamic features of the state. These features accelerate the training of the skill policy model and lay a foundation for rapid skill learning.

Secondly, this thesis proposes a strategy for mixing multi-feature weights that carries contextual dynamic information. It combines the static and dynamic features of the two state representation learning models. First, the feature data of the different models are batch-normalized within a fixed time window so that they follow a consistent distribution. Then the reward obtained after the robot takes an action in the environment is added to the mixed feature as a bias term, which increases the weight of state features with a high reward and reduces the weight of state features with a low reward. The resulting mixed feature effectively resolves the weighting problem between the features of different models and retains the contextual dynamic information of the features, making the training of skill policies faster and more stable.

Thirdly, this thesis proposes a sample similarity calculation method for efficient knowledge sharing between the skill policies of different operational tasks. The improved policy gradient reinforcement learning algorithm is used to train the skill learning model. First, the manipulator policy model is trained to convergence in one operational task environment. When the policy model for a new operational task is trained, features with context information in a fixed time window are used as the vectors for computing similarity scores. Similar sample data generated by the old policy are then added to the training of the new policy model to achieve knowledge sharing.

In summary, the improved policy gradient algorithm based on multiple features and state similarity proposed in this thesis addresses both efficient feature extraction from environmental states and efficient knowledge sharing between the skill policies of different tasks. It can learn the skill for a single operational task, such as pressing a static button or a dynamically moving button, and it can also share knowledge between the skill policies of two different operational tasks. Simulation results show that the proposed method not only accelerates the training of skill policies for operational tasks but also improves their overall performance. Compared with the baseline algorithm, it achieves the highest task success rate and average reward value.
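
The abstract does not give formulas for the feature mixing or the similarity score, so the following is only a minimal sketch under stated assumptions: features are standardized per dimension over a time window, the per-step reward is added as a bias to every mixed feature entry, and similarity is measured with cosine similarity against a fixed threshold. All function names, the threshold value, and the choice of cosine similarity are hypothetical illustrations, not the thesis's actual method.

```python
import math
from statistics import mean, pstdev


def standardize(window, eps=1e-8):
    """Scale each feature column of a time window to zero mean, unit variance."""
    columns = list(zip(*window))
    mus = [mean(c) for c in columns]
    sigmas = [pstdev(c) + eps for c in columns]
    return [[(x - m) / s for x, m, s in zip(row, mus, sigmas)] for row in window]


def fuse_features(static_feats, dynamic_feats, rewards):
    """Concatenate standardized static and dynamic features per time step,
    then add that step's reward as a bias (assumed form of the mixing rule)."""
    s = standardize(static_feats)
    d = standardize(dynamic_feats)
    return [[x + r for x in s_row + d_row] for s_row, d_row, r in zip(s, d, rewards)]


def cosine_similarity(a, b, eps=1e-8):
    """Cosine similarity between two flat vectors (assumed similarity score)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b + eps)


def flatten(window):
    """Turn a window of per-step feature rows into one flat vector."""
    return [x for row in window for x in row]


def select_similar(new_window, old_windows, threshold=0.9):
    """Return indices of old-task windows similar enough to the new-task window."""
    query = flatten(new_window)
    return [
        i
        for i, w in enumerate(old_windows)
        if cosine_similarity(query, flatten(w)) >= threshold
    ]
```

In this sketch, the windows selected by `select_similar` would be the old-task samples added to the new task's training batch; how those samples are weighted inside the policy gradient update is not specified in the abstract and is left out here.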

Keywords/Search Tags:

operational tasks, skill learning, knowledge sharing, feature fusion, reinforcement learning

Related items

1	Research On Knowledge Sharing And Exploration Mechanism In Multi-agent Reinforcement Learning
2	Knowledge Sharing For Multi-agent Reinforcement Learning Via Teacher-student Paradigm
3	Research On Picking Tasks Method Of Nuclear Robot Based On Deep Reinforcement Learning
4	Research On Knowledge Graph Construction Technology Based On Semi-Supervised Learning
5	Implementation Of Task Structure Utilization In Four Machine Learning Tasks
6	Research On Applying Deep Reinforcement Learning In Image Based Control And Image Classification Tasks
7	Research On Key Technologies Of Knowledge Graph Reasoning Based On Deep Reinforcement Learning
8	Research On Reinforcement Learning Methods Towards Unfixed Tasks And Non-static Environments
9	Accelerating Reinforcement Learning Training In Robotic Tasks
10	Supervised Reinforcement Learning: Methods And Applications