Transfer Of Action Sequence Prediction In Reinforcement Learning

Posted on: 2019-08-18
Degree: Master
Type: Thesis
Country: China
Candidate: X J Ding
Full Text: PDF
GTID: 2518306347957789
Subject: Computer Science and Technology
Abstract/Summary:
Applications of artificial intelligence such as autonomous driving and service robots require the ability to make decisions in uncertain, dynamic environments. This is the concern of reinforcement learning, an approach that learns from interaction with the environment. Because reinforcement learning proceeds by trial and error, it needs many samples and is hard to apply in practice. Transfer learning exploits the similarity between tasks: it is a framework for abstracting useful knowledge from learned tasks and accelerating learning in an unseen new task. Transfer in reinforcement learning takes many forms, such as the transfer of parameters, sub-policies, and representations, but most methods apply only to tasks that share the same state-action space. Methods that abstract high-level knowledge and transfer it flexibly have yet to be developed.

In transfer between reinforcement learning tasks, features of the state space are often considered, but the features and structure of the action space receive less attention. State spaces vary widely, from small spaces of one-hot encoded locations to large spaces of pixels in video games. Although representing and understanding the state space is a hard problem, understanding the action space is much simpler. A navigation task, for example, has only four actions: east, west, north, and south. Different combinations of action sequences carry knowledge learned from the environment. Hierarchical reinforcement learning divides the action space into high-level and low-level control, which greatly reduces the complexity of the problem. The structure of the action space, the various combinations of action sequences, and the relationships between actions constitute a kind of abstract semantic knowledge. This thesis considers the action space and proposes a new kind of transferable knowledge called the action pattern, formally described as the probability distribution of the next action given the history of the action sequence.

The method consists of the abstraction of the action pattern and its transfer. For abstraction, we propose an action sequence prediction model, which is a recurrent neural network (RNN). RNNs are often used to model sequential data and can capture long-term dependencies; in theory, they have been proven Turing-complete. The model can discover relationships between actions and modes of behaviour, and it can also generate action sequences similar to those of the source tasks. For transfer, two algorithms are proposed: intrinsic reward based transfer (IR) and heuristic exploration based transfer (HE). IR exploits the importance of reward in reinforcement learning and treats the model's prediction as an evaluation of the current action, which helps with the problem of sparse rewards. HE instead samples an action directly from the predictive model and uses it in the exploration strategy; this method is more robust than IR.

To verify the validity and flexibility of these transfer methods, we run experiments in transfer settings with both the same and different state spaces. The results show that transferring the action pattern markedly improves learning speed in the target tasks. Finally, this thesis extends the method to deep reinforcement learning and proposes an approach for transferring knowledge from small state space tasks to large state space tasks, which helps accelerate the solving of complex problems. We also implement DQN-based action pattern transfer and test its performance in a shooting game on the ViZDoom platform.
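The action pattern and the two transfer algorithms summarised in the abstract can be illustrated with a minimal Python sketch. It is not the thesis's implementation. The thesis uses an RNN as the predictive model, while this sketch swaps in a simple count-based predictor over the last few actions so that it runs with the standard library only. The names `ActionPatternModel`, `shaped_reward`, `heuristic_action`, and the parameters `context_len`, `smoothing`, `beta`, and `epsilon` are hypothetical, and the additive form of the intrinsic reward and the epsilon-style exploration rule are assumptions about how IR and HE might be realised.

```python
import random
from collections import Counter, defaultdict

ACTIONS = ["east", "west", "north", "south"]  # navigation actions from the abstract


class ActionPatternModel:
    """Estimates p(next action | recent action history).

    Count-based stand-in for the RNN predictor named in the abstract
    (an assumption made to keep the sketch dependency-free).
    """

    def __init__(self, actions, context_len=2, smoothing=1.0):
        self.actions = list(actions)
        self.context_len = context_len
        self.smoothing = smoothing  # additive smoothing for unseen actions
        self.counts = defaultdict(Counter)

    def _context(self, history):
        # Keep only the last `context_len` actions as the conditioning context.
        start = max(0, len(history) - self.context_len)
        return tuple(history[start:])

    def fit(self, sequences):
        # Count which action follows each context in the source-task sequences.
        for seq in sequences:
            for t, action in enumerate(seq):
                self.counts[self._context(seq[:t])][action] += 1

    def predict(self, history):
        # Smoothed probability distribution over the next action.
        counter = self.counts.get(self._context(history), Counter())
        total = sum(counter.values()) + self.smoothing * len(self.actions)
        return {a: (counter[a] + self.smoothing) / total for a in self.actions}


def shaped_reward(model, history, action, env_reward, beta=0.1):
    """IR-style sketch: add the predicted probability of the chosen
    action as an intrinsic bonus (assumed additive form)."""
    return env_reward + beta * model.predict(history)[action]


def heuristic_action(model, history, greedy_action, epsilon=0.1, rng=random):
    """HE-style sketch: when exploring, sample from the action pattern
    instead of sampling uniformly (assumed epsilon-style rule)."""
    if rng.random() < epsilon:
        probs = model.predict(history)
        weights = [probs[a] for a in model.actions]
        return rng.choices(model.actions, weights=weights)[0]
    return greedy_action


# Hypothetical action sequences collected in a source task.
source_sequences = [
    ["east", "east", "north", "east", "east", "north"],
    ["east", "east", "north"],
]
model = ActionPatternModel(ACTIONS, context_len=2)
model.fit(source_sequences)

history = ["east", "east"]
print(model.predict(history))
print(shaped_reward(model, history, "north", env_reward=0.0))
print(heuristic_action(model, history, greedy_action="west", epsilon=0.5))
```

In this sketch the intrinsic bonus gives a learning signal even when the environment reward is zero, which is the sparse-reward motivation the abstract gives for IR. HE leaves the reward untouched and only biases which actions are tried during exploration.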
Keywords/Search Tags: reinforcement learning, transfer of reinforcement learning, sequence prediction, action sequence, recurrent neural network