| Maze robots have attracted the attention of many researchers at home and abroad. Traditional maze robot navigation algorithms, such as the flood-fill algorithm or the A* algorithm, generalize poorly. Navigation algorithms based on reinforcement learning generalize better, but they suffer from sparse rewards. Approaches to the sparse-reward problem include reward design and learning, experience replay, balancing exploration and exploitation, and multi-task learning with auxiliary tasks; among these, designing intrinsic rewards has proved to be one of the effective methods. In human and animal learning, curiosity plays an important role as an intrinsic motivation: it encourages people and animals to explore, learn about the unknown, and gain more information to solve the problem at hand. In this paper, curiosity is introduced into the autonomous navigation algorithm of maze robots, and a series of related studies is carried out within the framework of reinforcement learning. The results are as follows. (1) An autonomous navigation algorithm for maze robots based on artificial curiosity is designed. First, artificial curiosity is implemented with a Sigmoid function that outputs higher values for the unknown. The selection probability of each action is calculated from its curiosity value, so that curiosity improves learning efficiency by influencing action selection in reinforcement learning. Second, a memory module is designed to store the Q values of nodes. After each learning episode, the information stored in the memory module is reinforced in reverse order of storage time, from the most recent entry to the earliest. The algorithm is verified in a corridor maze. The results show that, with artificial curiosity, the maze robot can learn the shortest route among multiple paths, and that its learning time is shorter than that of the traditional Q-learning algorithm. (2) Artificial curiosity based on the Sigmoid function generalizes insufficiently and solves problems poorly in more complex environments. To address this, an autonomous navigation algorithm for maze robots based on improved artificial curiosity is proposed, in which artificial curiosity is no longer described by a single function but is designed as an improved artificial curiosity system inspired by the Intrinsic Curiosity Module (ICM). The system consists of a prediction network, an associative memory network, an inference algorithm built on the associative memory network, and a distance calibration algorithm. The output error between the prediction network and the associative memory network represents curiosity. The larger the error, the larger the curiosity value, indicating that the agent is less familiar with a given action and therefore more interested in the unknown, which drives it to explore. The inference algorithm finds nodes with high curiosity: when the agent is trapped in a local minimum, it generates an action sequence that helps the maze robot escape. The algorithm ensures that the agent does not repeat exploration, which promotes learning. Experiments in a complex maze environment show that the algorithm learns and performs better in complex environments. (3) The two methods above both target corridor mazes. To help the agent adapt better to real environments, an autonomous navigation algorithm for maze robots with inference ability in complex environments is proposed. It consists of a developmental associative memory network, a node self-building system, and an inference algorithm. The node self-building system helps the agent build nodes autonomously, in a developmental manner, from the current information, and the node information is memorized by the developmental associative memory network. Curiosity is represented by a curiosity degree assigned to each node. When the agent's curiosity is low in all directions within an area, the inference algorithm helps the robot find an area with high curiosity and continue exploring and learning. The experimental results show that the agent can find the optimal path in an open maze map, and that its learning time is shorter than that of the traditional DQN algorithm. This study combines curiosity and reinforcement learning to design three autonomous navigation algorithms for maze robots. Compared with traditional maze robot navigation algorithms, they are more adaptable. Introducing curiosity makes the robot's exploration and learning more efficient and provides new ideas for autonomous navigation algorithms for maze robots. |
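
The first method described above (Sigmoid-based curiosity that shapes action selection, plus a memory that is reinforced from the most recent entry backward) can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption made for illustration: curiosity is taken to be a decreasing Sigmoid of a per-action visit count, selection probabilities are the normalized curiosity values, and the memory pass is a standard Q-learning update applied to stored transitions in reverse order. The names `curiosity`, `action_probabilities`, `choose_action`, `new_q_table`, and `replay_backward`, and the parameters `k`, `midpoint`, `alpha`, and `gamma`, are hypothetical.

```python
import math
import random
from collections import defaultdict


def curiosity(visits, k=1.0, midpoint=3.0):
    """Assumed curiosity measure: a decreasing Sigmoid of the visit count.

    Rarely tried actions score close to 1, familiar ones close to 0.
    The exponent is clamped so that large counts cannot overflow math.exp.
    """
    x = min(k * (visits - midpoint), 50.0)
    return 1.0 / (1.0 + math.exp(x))


def action_probabilities(visit_counts):
    """Turn per-action curiosity values into selection probabilities."""
    values = [curiosity(v) for v in visit_counts]
    total = sum(values)  # always > 0 because of the clamp above
    return [v / total for v in values]


def choose_action(visit_counts, rng=random):
    """Sample an action index, favoring actions with higher curiosity."""
    probs = action_probabilities(visit_counts)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def new_q_table():
    """Q table indexed as q[state][action], defaulting to 0.0."""
    return defaultdict(lambda: defaultdict(float))


def replay_backward(q, episode, alpha=0.5, gamma=0.9):
    """Apply Q-learning updates to stored transitions, newest first.

    Each transition is (state, action, reward, next_state, done).
    Going backward lets a reward found at the end of an episode
    propagate toward the start within a single pass.
    """
    for state, action, reward, next_state, done in reversed(episode):
        best_next = 0.0 if done else max(q[next_state].values(), default=0.0)
        target = reward + gamma * best_next
        q[state][action] += alpha * (target - q[state][action])
```

With these assumptions, an action that has never been tried is selected far more often than one tried five times, and a single backward pass over a three-step corridor episode already gives the starting state a nonzero Q value. A forward pass over the same episode would update only the last step.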