
Research On 3D Reconstruction Method Based On Image Semantic Scene Completion

Posted on: 2024-04-28    Degree: Master    Type: Thesis
Country: China    Candidate: Y L Guo    Full Text: PDF
GTID: 2568306935484384    Subject: Computer Science and Technology
Abstract/Summary:
Semantic scene completion is a computer vision and machine learning technique that combines semantic segmentation with shape completion, aiming to infer a complete 3D model of a scene, together with semantic labels, from the incomplete scene information in an input image. Compared with traditional sensor-based 3D reconstruction techniques, indoor 3D semantic reconstruction via semantic scene completion on RGB-D images has become an important research direction, with significant value in architecture, industrial design, virtual reality, and medicine.

In this thesis, semantic segmentation and shape completion, which are highly coupled and mutually reinforcing, are unified into a single task, and a lightweight deep learning network, RIBNet, is proposed to reconstruct a semantically labeled 3D scene from RGB-D images. To address the heavy computational cost of processing image features in 3D space with the original residual blocks, this thesis proposes an improved 3D residual block and combines it into an Inception module, so that the same layer of the network can extract features at different receptive-field sizes using convolution kernels of different sizes while reducing the computation and parameter count in 3D space. To address the poor completion accuracy of existing multimodal semantic scene completion methods, which do not make full use of multimodal information, this thesis designs a cascaded multilevel multimodal feature fusion method that fully exploits and fuses color and depth information to strengthen the network's semantic reconstruction capability.

To handle the noise and sparsity of depth images in the real-scene dataset NYUv2, this thesis applies a bilateral filtering algorithm for noise reduction and introduces the unguided depth completion algorithm HMS-Net and the RGB-guided depth completion algorithm GuideNet to generate the denser NYU-HMS and NYU-Guide datasets for semantic scene completion research.

Experiments analyze the 3D semantic reconstruction results of RIBNet on the real-scene dataset NYUv2 and the synthetic-scene dataset SUNCG, and ablation studies verify that the proposed method effectively fuses RGB and depth image information and quantify the effect of depth-map completeness on the semantic scene completion results. The results show that RIBNet achieves average IoU values of 34.6% and 53.8% on the NYUv2 and SUNCG datasets, respectively, a significant improvement over the SSCNet and SATNet baseline methods. After the two depth-completion operations, the proposed model achieves average IoU values of 34.68% and 37.53% on the NYU-HMS and NYU-Guide datasets, respectively, demonstrating the benefit for RIBNet of the dense depth maps generated by the two depth-completion methods.
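The abstract claims that combining the improved 3D residual block with an Inception module saves computation and parameters in 3D space. The abstract does not specify the block's exact design, so the sketch below is a back-of-the-envelope parameter count under a common assumption: a bottleneck-plus-mixed-kernel Inception layout, compared against a plain residual branch of two full-width 3x3x3 convolutions. It is an illustration of the cost argument, not the published RIBNet architecture.

```python
def conv3d_params(c_in, c_out, k):
    """Weights plus biases of a 3D convolution with a cubic k x k x k kernel."""
    return c_out * (c_in * k ** 3 + 1)

C = 64  # assumed channel width

# Plain residual branch: two 3x3x3 convolutions at full width.
plain = 2 * conv3d_params(C, C, 3)

# Assumed Inception-style branch: 1x1x1 bottlenecks down to C//4 channels,
# then parallel 1x1x1 / 3x3x3 / 5x5x5 paths whose outputs concatenate,
# giving several receptive-field sizes in the same layer.
b = C // 4
inception = (
    3 * conv3d_params(C, b, 1)   # three bottleneck convolutions
    + conv3d_params(b, b, 1)     # 1x1x1 path
    + conv3d_params(b, b, 3)     # 3x3x3 path
    + conv3d_params(b, b, 5)     # 5x5x5 path (largest receptive field)
)

print(plain, inception)  # the cubic kernel term dominates the plain branch
```

Under these assumed widths the factorized branch uses roughly a fifth of the parameters of the plain one, which is the kind of saving that makes residual learning affordable in 3D.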
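The bilateral filter used for the NYUv2 depth maps smooths sensor noise while preserving depth discontinuities at object boundaries, because each pixel's weight decays with both spatial distance and depth difference. As a minimal sketch of the idea, the following applies a bilateral filter to a 1D depth scanline (the thesis filters full 2D depth images; the values, radius, and sigma parameters here are illustrative assumptions):

```python
import math

def bilateral_1d(signal, radius=2, sigma_s=1.0, sigma_r=0.1):
    """Edge-preserving smoothing of a 1D depth scanline.

    Each sample becomes a weighted average of its neighbours, with weights
    decaying over spatial distance (sigma_s) and over depth difference
    (sigma_r), so large depth jumps contribute almost nothing and
    object boundaries stay sharp.
    """
    out = []
    for i, center in enumerate(signal):
        num = den = 0.0
        for j in range(max(0, i - radius), min(len(signal), i + radius + 1)):
            w = (math.exp(-((i - j) ** 2) / (2 * sigma_s ** 2))
                 * math.exp(-((signal[j] - center) ** 2) / (2 * sigma_r ** 2)))
            num += w * signal[j]
            den += w
        out.append(num / den)
    return out

# Noisy depth scanline (metres) with a sharp object boundary at index 4.
depth = [1.00, 1.02, 0.98, 1.01, 2.00, 2.03, 1.97, 2.01]
smoothed = bilateral_1d(depth)
```

A plain Gaussian blur would drag the values on either side of the 1 m jump toward each other; here the range weight `exp(-(Δdepth)^2 / 2σ_r^2)` is effectively zero across the boundary, so the two surfaces are denoised independently.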
Keywords/Search Tags:Semantic Segmentation, Shape Completion, 3D Reconstruction, Semantic Scene Completion, Computer Vision, Deep Learning