
Semantic-based Indoor RGB-D Camera Mapping And Reconstruction

Posted on: 2020-03-20
Degree: Master
Type: Thesis
Country: China
Candidate: Q Wang
Full Text: PDF
GTID: 2428330599458978
Subject: Control Engineering

Abstract/Summary:
In recent years, SLAM technology has developed rapidly and has broad application prospects in industrial robots, AR/VR, and intelligent vehicles. Visual SLAM gives an agent an understanding of the geometric information of the environment but ignores the environment's semantic information. Visual SLAM and image semantic segmentation are complementary: the former attends to the geometry of the scene, the latter to its semantic understanding. This thesis therefore combines visual SLAM with image semantic segmentation to study camera localization and reconstruction based on semantic information. Because RGB-D cameras are low in cost, make it easy to construct dense maps that reflect the completeness of the reconstructed scene, and have a limited working range, this thesis studies RGB-D camera localization and reconstruction in indoor environments. The starting point is to combine semantic segmentation with classical visual camera localization and reconstruction so that the two assist each other in building a more accurate three-dimensional semantic map.

First, in the feature-matching stage between two RGB images in the visual odometry, many feature points are mismatched. The thesis proposes using the semantic segmentation results to match feature points within each semantic category, so that the matched pairs participating in camera motion estimation are more accurate. Then, in the stage of solving the camera pose by reprojection, erroneous three-dimensional points may be mapped into the RGB image. The thesis proposes using semantic label information to narrow the projection search range and reduce mismatches in the reprojection-error optimization, so that the camera pose is solved more accurately. Second, in the local and global optimization stages, semantic labels serve as prior information: the inherent geometric properties of segmented objects such as walls, floors, and ceilings are used to constrain three-dimensional points of the same category when the camera poses and three-dimensional point coordinates are jointly optimized, making the optimization of poses and points more complete. Finally, the pixel-wise class prediction probabilities of multiple associated keyframes are fused, and the depth information provided by the RGB-D camera, together with the camera poses, is used to map the two-dimensional semantic labels into three-dimensional space, thereby establishing a more accurate and complete semantic map. The consistency of camera localization improves the accuracy of single-frame semantic segmentation. At this stage, instead of changing the camera pose estimates, the reconstructed point cloud is strongly constrained, and principal component analysis is used to adjust the point cloud to the position closest to the real scene.

In summary, this thesis makes full use of semantic information to assist feature matching, camera tracking, global optimization, and mapping during camera localization and reconstruction, making the localization-and-reconstruction system more accurate and the reconstruction more complete. It also makes the fused class predictions more accurate than the original single-frame predictions. In this way, semantic segmentation is integrated more closely with camera localization and reconstruction.
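As an illustration, the category-restricted feature matching described above can be sketched as follows. This is a minimal brute-force sketch: the function name, the binary-descriptor representation, and the distance threshold are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def match_by_class(desc1, labels1, desc2, labels2, max_dist=64):
    """Brute-force descriptor matching restricted to feature pairs whose
    semantic labels agree. Descriptors are binary arrays compared by
    Hamming distance; names and the threshold are illustrative."""
    matches = []
    for i, (d1, l1) in enumerate(zip(desc1, labels1)):
        best_j, best_dist = -1, max_dist + 1
        for j, (d2, l2) in enumerate(zip(desc2, labels2)):
            if l1 != l2:                      # skip candidates from other semantic classes
                continue
            dist = int(np.sum(d1 != d2))      # Hamming distance on binary descriptors
            if dist < best_dist:
                best_j, best_dist = j, dist
        if best_j >= 0 and best_dist <= max_dist:
            matches.append((i, best_j, best_dist))
    return matches
```

Restricting candidates to the same semantic class shrinks the search space and removes many of the cross-object mismatches that would otherwise corrupt the motion estimate.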
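The fusion of class prediction probabilities over associated keyframes can likewise be sketched with a naive-Bayes product, carried out in log space for numerical stability. This is only a sketch of the fusion step under an independence assumption; the keyframe-association machinery and any class prior are omitted.

```python
import numpy as np

def fuse_class_probs(per_frame_probs):
    """Fuse the class probability vectors predicted for the same pixel
    (or 3D point) in several associated keyframes by multiplying them
    (naive-Bayes fusion) and renormalizing. A minimal sketch."""
    log_p = np.sum(np.log(np.asarray(per_frame_probs)), axis=0)
    log_p -= log_p.max()          # guard against underflow before exponentiating
    p = np.exp(log_p)
    return p / p.sum()
```

For example, two frames that predict a pixel as (0.6, 0.4) and (0.7, 0.3) over two classes fuse to roughly (0.78, 0.22), sharper than either single-frame prediction, which is the effect the abstract describes.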
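Finally, the PCA-based point-cloud adjustment can be illustrated for a planar structure such as a floor: the plane normal is recovered as the direction of least variance, and the cloud is rotated so that the normal coincides with a canonical axis. This is a simplified sketch under the assumption that the labeled points form a single plane; the thesis does not specify the exact procedure.

```python
import numpy as np

def align_plane_to_axis(points, target_normal=(0.0, 0.0, 1.0)):
    """Estimate the dominant plane normal of a labeled point cloud via
    PCA (SVD of the centered points) and rotate the cloud about its
    centroid so the normal matches a canonical axis. Illustrative only."""
    pts = np.asarray(points, dtype=float)
    centroid = pts.mean(axis=0)
    centered = pts - centroid
    # The right singular vector of the smallest singular value is the plane normal
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    normal = vt[-1]
    target = np.asarray(target_normal, dtype=float)
    if np.dot(normal, target) < 0:            # fix the sign ambiguity of the normal
        normal = -normal
    v = np.cross(normal, target)              # Rodrigues rotation taking normal -> target
    c = float(np.dot(normal, target))
    if np.linalg.norm(v) < 1e-12:
        R = np.eye(3)                         # already aligned
    else:
        vx = np.array([[0, -v[2], v[1]],
                       [v[2], 0, -v[0]],
                       [-v[1], v[0], 0]])
        R = np.eye(3) + vx + vx @ vx / (1.0 + c)
    return centered @ R.T + centroid
```

Applying such a constraint to points labeled as walls, floors, or ceilings pulls the reconstructed cloud toward the axis-aligned layout of a real indoor scene without touching the camera pose estimates.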
Keywords/Search Tags: Semantic segmentation, Similarity matching, Geometric constraint, Bayesian fusion, Simultaneous Localization and Mapping