Embodied Intelligent Visual Navigation Based On Entropy Estimation Energy Model

Posted on: 2024-08-05
Degree: Doctor
Type: Dissertation
Country: China
Candidate: N Y Wang
Full Text: PDF
GTID: 1528307292497374
Subject: Computer application technology
Abstract/Summary:
Visual navigation is one of the most sought-after research directions in computer vision and machine learning. Its core objective is to equip machines with human-like visual perception and navigation capabilities, enabling them to autonomously achieve precise localization and path planning in complex and unknown environments. With the continuous development of computer vision and deep learning technologies, research in visual navigation has become increasingly complex and diverse. Traditional visual navigation methods often rely on accurate maps and sensors, which limits their applicability to unknown or dynamic environments. Modern visual navigation research therefore increasingly focuses on autonomous and robust methods that can adapt to varied environmental conditions and navigate efficiently in the absence of prior information.

Embodied intelligent visual navigation encompasses a series of complex processes, ranging from perceiving visual information in the environment and understanding the interactive relationships in the cognitive scene to making autonomous behavioral decisions based on this information. It is a crucial subject in the realization of modern technological applications such as autonomous mobile robots, self-driving cars, and augmented reality systems. In dynamic and complex environments, embodied intelligent visual navigation faces several challenges within the "perception-interaction-cognition-action" loop.

(1) Visual stability perception is fundamental to visual navigation, as it determines the quality of the visual data used for navigation and directly affects navigation performance. In practical applications, navigation systems are often affected by lighting changes, occlusions, noise, and motion blur, resulting in unstable perception.

(2) Understanding interactive relationships is another significant challenge. Navigation systems need to comprehend the objects, structures, and other agents in the environment to support accurate decision-making. Traditional methods often lack rich interaction information, such as relationships between target objects, motion trajectories, and scene semantics.

(3) Autonomous behavioral decision-making requires a machine system to choose among different navigation options and to decide in real time. Navigation systems must balance alternative navigation paths and action plans under these constraints, yet traditional methods often lack sufficient information to weigh different navigation paths and action strategies.

To address these challenges, this dissertation introduces the concept of entropy modeling. Specifically, it proposes a neighborhood entropy estimation energy model that solves for entropy by modeling it within the navigation process. The main contents are as follows:

1. Visual stability perception is critical for embodied intelligent visual navigation. To address unstable visual perception, disentangle motion features in complex scenes, and correct the shake trajectory of the visual sensing system, a visual navigation stability perception model based on cross-entropy optical flow attention is introduced. The model consists of a self-supervised contrastive learning Transformer network, a sparse optical flow perception network, and a multimodal cognitive fusion network. The model uses optical flow to estimate motion: the sparse optical flow perception network senses partially sparse optical flow features and passes them as input to the self-supervised contrastive learning Transformer network, which generates sparse optical flow features. These features are then fed to the multimodal cognitive fusion network, which achieves stable perception for visual navigation.

2. Cognitive interaction recognition is a vital component of embodied intelligent visual navigation. To address the complexity of social behaviors and the randomness of motion, a visual navigation interaction recognition model based on correlated entropy domain spatiotemporal graph convolution is introduced to model interaction relationships. The model includes a domain spatiotemporal graph convolution neural network and a gated dilated causal convolution neural network. The domain spatiotemporal graph convolution neural network creates a matching graph structure for each time step and calculates the weighted adjacency matrix of each graph structure using correlated entropy, yielding a sequence embedding representation of pedestrian interaction relationships. The gated dilated causal convolution neural network reduces linear superposition in the hidden layers and filters features through gate control to achieve interaction recognition for visual navigation.

3. Reliable behavioral decision-making is crucial for embodied intelligent visual navigation. To address the challenge of multi-modal obstacle avoidance decision-making during navigation, a visual navigation autonomous obstacle avoidance decision model based on causal entropy deep self-motion is introduced. The model consists of a cognitive generation network, a policy decision network, and a latent partition network, which together learn autonomous obstacle avoidance decisions from expert policies. The model uses binocular vision to perceive scene depth, which serves as input to the cognitive generation network; this network generates an obstacle avoidance strategy while aiming to maximize its causal entropy. The policy decision network then optimizes the generated strategy by referencing expert policies. The generated strategy is simultaneously passed to the latent partition network, which captures the latent factors contained in expert policies and executes multi-modal obstacle avoidance. These three core networks iteratively optimize multi-modal strategies based on causal entropy and mutual information theory to achieve autonomous decision-making in visual navigation.
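
As a rough illustration of the per-time-step weighted adjacency matrix described in the second contribution, the following is a minimal sketch and not the dissertation's implementation. It assumes that "correlated entropy" can be approximated by an empirical correntropy-style estimate, that is, the mean of a Gaussian kernel over the component-wise differences between two pedestrians' feature vectors. The function names, the `sigma` bandwidth, the `(T, N, D)` feature layout, and the symmetric normalization step are all hypothetical choices made for the example.

```python
import numpy as np


def correntropy_adjacency(feats, sigma=1.0):
    """Build one weighted adjacency matrix per time step.

    feats: array of shape (T, N, D) holding D features (for example x, y)
           for each of N pedestrians at each of T time steps. (Assumed layout.)
    Returns an array of shape (T, N, N) whose entry [t, i, j] is the mean
    Gaussian kernel value over the D component differences between
    pedestrians i and j at time t. (Assumed correntropy-style estimate.)
    """
    feats = np.asarray(feats, dtype=float)
    # Pairwise component-wise differences: shape (T, N, N, D).
    diff = feats[:, :, None, :] - feats[:, None, :, :]
    kernel = np.exp(-(diff ** 2) / (2.0 * sigma ** 2))
    # Averaging the kernel over components gives a weight in (0, 1].
    return kernel.mean(axis=-1)


def normalize_adjacency(adj):
    """Symmetrically normalize each graph: D^{-1/2} A D^{-1/2}.

    The diagonal of `adj` is 1 (each node is fully similar to itself),
    so every degree is at least 1 and the division is safe.
    """
    deg = adj.sum(axis=-1)                      # shape (T, N)
    inv_sqrt = 1.0 / np.sqrt(deg)
    return adj * inv_sqrt[:, :, None] * inv_sqrt[:, None, :]


if __name__ == "__main__":
    # Two time steps, three pedestrians, 2-D positions (made-up values).
    positions = np.array([
        [[0.0, 0.0], [1.0, 0.0], [5.0, 0.0]],
        [[0.0, 0.0], [0.5, 0.0], [5.0, 0.0]],
    ])
    adj = correntropy_adjacency(positions, sigma=1.0)
    norm = normalize_adjacency(adj)
    print(adj.shape, norm.shape)
```

In this sketch, pedestrians that are close in feature space receive weights near 1 and distant ones receive smaller weights, so the sequence of matrices could serve as input to a graph convolution applied at each time step. A different kernel or bandwidth would change the weights but not the overall structure.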
Keywords/Search Tags: Energy Model, Entropy Estimation Algorithm, Embodied Intelligence, Visual Navigation