| Three-dimensional semantic information is an important factor in enabling intelligent machines to understand the world, and an important part of artificial intelligence. Computer intelligence is therefore required to understand not only the basic shape of real-world objects but also their semantic content. Sensors such as lidar and visual cameras have become the "eyes" of intelligent machines: they capture environmental information, which computers, acting as "minds", analyze to complete the cognition of the environment. Among these data sources, visual data such as images contain the most abundant environmental information, and how to help intelligent machines perceive and understand their surroundings as humans do from such visual data has become a basic problem that must be solved in computer vision and artificial intelligence applications. The main research of this paper is as follows. Firstly, the basic principle and architecture of visual SLAM are discussed, the current classification of monocular visual SLAM algorithms is expounded, and an improved SLAM algorithm framework based on ORB-SLAM2 is proposed. The algorithm combines the advantages of the direct method and the feature point method: it retains the speed of the direct method and the high precision and loop-closure capability of the feature-based method, and it can deal with problems such as low texture and perceptual aliasing in complex environments. The framework supports the establishment and optimization of a 3D semantic model of an indoor scene, and the composition and function of its algorithm modules are analyzed in detail. Secondly, in the research on 2D image semantic segmentation, semantic segmentation algorithms based on convolutional networks are compared and analyzed, and the DeepLab V3+ semantic segmentation algorithm is selected as the 2D semantic segmentation model in this paper. This network model offers high efficiency and high precision in semantic segmentation. In the 3D semantic modeling experiment, the task of the model is to perform pixel-level semantic annotation on each key frame, so as to add semantic information to the 3D space. The improved SLAM algorithm proposed in this paper for dynamic environments is combined with the semantic information provided by the convolutional neural network for semantic segmentation; it uses an improved Bayesian method for semantic association, lifts the semantic labels from 2D images into 3D, and builds a more accurate 3D semantic model. Positioning is then optimized and updated in the OctoMap environment to build a consistent 3D semantic map. The method is tested on a public dataset of indoor environments. Compared with the traditional visual SLAM algorithm, it improves overall mapping accuracy and speed in complex environments, reduces the impact of illumination changes, and offers both good real-time performance and high application value. |
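The Bayesian semantic association mentioned above can be illustrated with a minimal sketch. The abstract does not describe the improved variant used in the paper, so the code below shows only the standard recursive Bayesian update of per-voxel class probabilities that such methods usually build on. The function name `fuse_label_probabilities`, the three-class setup, and the observation values are hypothetical and chosen purely for illustration.

```python
def fuse_label_probabilities(prior, likelihood):
    """Standard recursive Bayesian label update for one map voxel.

    prior:      class probabilities currently stored in the voxel
    likelihood: class probabilities predicted by the 2D segmentation
                network for the pixel that projects into this voxel
    Note: this is the textbook update, not the paper's improved method.
    """
    # Element-wise product of prior and new observation
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    total = sum(unnormalized)
    if total <= 0.0:
        # Observation is incompatible with the prior; keep the prior
        return list(prior)
    # Normalize so the class probabilities sum to 1
    return [u / total for u in unnormalized]


# Hypothetical example: one voxel, three classes, three key-frame observations
num_classes = 3
voxel = [1.0 / num_classes] * num_classes  # uniform prior
observations = [
    [0.7, 0.2, 0.1],
    [0.6, 0.3, 0.1],
    [0.2, 0.7, 0.1],  # one disagreeing key frame
]
for obs in observations:
    voxel = fuse_label_probabilities(voxel, obs)

# Index of the most probable class after fusion
fused_label = max(range(num_classes), key=lambda i: voxel[i])
```

In this sketch a single disagreeing key frame lowers the confidence of the leading class but does not flip the fused label, which is the usual motivation for fusing labels across key frames rather than trusting each segmentation result on its own.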