| Intelligent navigation is a fundamental requirement for autonomous locomotion devices, including robots, and plays a crucial role in enabling intelligent agents to explore unknown environments. However, navigation technology still faces challenges in real-world environments, including but not limited to: uncertainty in policy decisions due to sensor and controller noise; performance degradation caused by complex weather and road conditions; decreased computational efficiency during prolonged operation in large-scale scenarios; and difficulty in training models that generalize well to unknown scenarios, since training data cannot cover all weather conditions and unexpected situations. These challenges significantly impact the design and implementation of deep learning and reinforcement learning models for navigation tasks. In response to these challenges, this thesis draws inspiration from the visual and spatial cognitive mechanisms of mammalian brains. Guided by the neural pathways from the visual cortex to the hippocampus, and using discoveries about cell function in the brain and theoretical models of cognitive mechanisms from neuroscience as references, relevant computational models are designed and applied to various tasks in navigation systems. The main research contents and contributions of the thesis are as follows. First, the thesis investigates the problem of visual odometry, from visual perception to visual localization. Current visual odometry research has focused on texture-rich indoor environments and simple traffic scenarios, neglecting the ability to generalize across a wide range of sensor noise and complex scenarios. Inspired by visual attention mechanisms, a position-aware optical flow estimation network is designed. Then, by analyzing the optimization processes of bundle adjustment and deep learning, geometric measurement-based bundle adjustment is improved. Finally, the improved bundle adjustment is combined with optical flow and 
depth estimation networks for end-to-end unsupervised training of pose estimation. Experimental results demonstrate that the proposed optical flow estimation and pose estimation methods not only improve performance in outdoor scenes but also excel under sensor noise, at low frame rates, and in cross-dataset transfer evaluations. Second, the thesis explores grid representation and its role in loop closure detection tasks. Current vision-based loop closure detection techniques are limited by long computation times and by excessive sensitivity to viewpoint changes and dynamic objects. Inspired by the spatial characteristics and representational stability of mammalian grid cells, a pose-based loop closure detection model is designed. By constraining model features with place cells and head-direction cells, the network spontaneously learns grid-like feature representations. Experimental results and interpretability analysis show that the grid-like feature representation not only enhances model generalization but also contributes significantly to loop closure detection. Additionally, a loop closure detection algorithm that fuses pose and visual information is designed. Experimental results indicate that this algorithm significantly improves loop closure detection performance while substantially reducing computational cost. Finally, the thesis investigates a point goal navigation model that operates without the assistance of a localization system. Most existing research assumes that the robot’s movements are noise-free and that accurate Global Positioning System (GPS) sensors are available for localization. In real indoor environments, both the robot’s acquisition system and control system are affected by noise, and accurate GPS signals are difficult to obtain in most environments. Inspired by the path integration and landmark calibration mechanisms in mammalian navigation, a point goal navigation model based on unsupervised visual-motion 
calibration is designed. To address the uncertainty of robot actions and GPS localization in indoor navigation tasks, the performance of unsupervised visual odometry pose estimation is enhanced by introducing richer visual features. Then, collision information, path integration of self-motion, and visual odometry pose estimates are combined as input to the policy network for action decisions in the current environment. By employing reinforcement learning, visual localization is used for unsupervised calibration of self-motion path integration during training. Experimental results show that the proposed algorithm not only achieves point goal navigation in noisy environments without GPS but also outperforms some supervised learning models in navigation performance. In summary, this thesis focuses on the field of visual navigation, exploring visual perception, spatial localization, and policy decision. Across these three components, visual perception enhances the system’s understanding of the environment, providing crucial input for subsequent spatial localization and action decisions. Spatial localization builds on an accurate understanding of the environment, estimating the robot’s position and offering a spatial reference for the navigation system. Policy decisions are made based on the agent’s understanding of the environment and awareness of its own position, representing the final link in autonomous navigation. By leveraging various visual and spatial cognitive mechanisms found in the brain, the thesis addresses challenges in critical tasks of navigation systems. This work hopes to inspire new research approaches for intelligent navigation systems that draw on biological navigation mechanisms. |