3D Semantic Mapping Based On Deep Learning And Monocular Visual SLAM

Posted on:2019-12-10

Degree:Master

Type:Thesis

Country:China

Candidate:H X Ao

Full Text:PDF

GTID:2428330596961314

Subject:Precision machinery and instruments

Abstract/Summary:

PDF Full Text Request

With the development of artificial intelligence,machines are beginning to replace humans to accomplish certain tasks,such as sweeping robots,driverless cars,and industrial robots.The perception and understanding of environment is the base and requirement for automation of intelligent systems.Sensors,such as lasers and cameras,are used as "eyes" of machine to capture surrounding information and computers are used as "a brain" to understand the scene.Especially,the visual data is particularly rich in environmental information.How to use them to help machine to understand scenes as a way of human beings is a critical problem in computer vision and artificial intelligence.To solve this problem,this paper uses monocular visual data with SLAM and deep learning to establish a 3D semantic scene model,which presents 3D and semantic information of scene,and guides machines to work autonomously.This paper focuses on visual information to build 3D semantic scene model.First of all,the visual SLAM technology is discussed.The monocular visual “LSD-SLAM” algorithm is selected for building 3D map.Second,we discusses the semantic segmentation technology based on deep learning,analyzes the compression and optimization of deep convolutional neural networks,and proposes a high-performance semantic segmentation network.Finally,semantic segmentation is merged with LSD-SLAM algorithm to convert the 2D image and semantic information into 3D space,and the 3D semantic model is constructed and optimized.In the research of 2D image semantic segmentation,aiming at the efficiency deficiency of traditional semantic segmentation models,we study on compression method of convolutional neural network model,and compresssemantic segmentation model by depth-wise separable convolution and multi-size-kernel fusion convolution.In addition,the multi-scale features are combined with the modules as the pyramid pooling and the optimized residual model.Then,a high-performance semantic segmentation network was designed.With the experimental evaluation,it has been proved that the model achieves a good performance in the model size,speed and accuracy.In the research of 3D semantic scene modeling,the semantic segmentation is used to semantically label the scene at the pixel level,and then combined with LSD-SLAM algorithm.This is aimed to calculate and optimize the depth of semantic frame and convert labels from 2D to 3D.Finally a 3D map with semantic information is constructed.In this process,the Bayesian method is used to fuse semantic tags,and the entire 3D semantic model is globally optimized by dense Conditional Random Field(CRF).Experiments show that the method can accurately and effectively construct the 3D semantic model under both the KITTI data set and laboratory private data,and our method can run in real time at a frequency of 20 Hz on the experimental platform.

Keywords/Search Tags:

computer vision, monocular visual SLAM, deep learning, semantic segmentation

PDF Full Text Request

Related items

1	Research On Monocular Visual Semantic SLAM In Complex Environment
2	Visual Semantic SLAM Based On Deep Learning
3	Research On Monocular Visual SLAM Algorithm Combined With CNN And IMU In Complex Environments
4	Visual SLAM 3D Modeling And Application Research Based On Scene Semantic Information
5	Research On Semantic SLAM System Based On Deep Learning
6	Three-Dimensions Semantic Map Construction Based On Stereo Vision
7	Research On Dynamic Semantic Multi-graph SLAM Algorithm Based On Monocular Vision
8	Semantic SLAM Based On Monocular Visual SLAM And Object Detection
9	Research On Lightweight Semantic SLAM Based On Deep Learning
10	Research On Mobile Robot SLAM Algorithm Based On Image Semantic Segmentation