
Research on 3D Reconstruction Method Based on Multi-View Depth Estimation

Posted on: 2024-03-26
Degree: Master
Type: Thesis
Country: China
Candidate: H Zhao
GTID: 2568307094459484
Subject: Computer technology
Abstract/Summary:
With the continuous development of technology, 3D reconstruction has been widely used in people's daily lives. Areas such as robot navigation, virtual reality, augmented reality, medical image processing, and autonomous driving all rely on 3D reconstruction to transform 2D data into realistic 3D models. After years of research, significant progress has been made in 3D reconstruction. However, as living standards continue to improve, the demand for more precise and realistic 3D models reconstructed from 2D images is also increasing. How to enable computers to reconstruct more detailed and realistic 3D models from 2D images therefore remains a significant problem in computer vision research.

Multi-view stereo reconstruction based on deep learning uses image information from multiple viewpoints to reconstruct 3D models. Feature extraction and cost volume generation are two critical steps in this process. The quality of feature extraction determines the quality of the subsequent cost volume, and the quality of the cost volume directly affects the accuracy of the final 3D reconstruction. This thesis addresses the poor feature extraction and the high memory consumption of cost volume generation in existing methods, and proposes improvements to the 3D reconstruction network based on multi-view depth estimation. The main research content is as follows:

1. To obtain more accurate feature information and improve feature matching, this thesis introduces a channel-spatial attention mechanism into the feature extraction network, which is based on a feature pyramid. Depth estimation is performed with a coarse-to-fine cascade structure across feature maps at three different scales. At the same time, adaptive cost aggregation is performed by weighting the cost volume produced by each view according to its similarity. Experiments on the DTU dataset show that this method improves the completeness metric by 5.4% compared to the original model.

2. To address the overly smooth edges and inaccurate depth caused by the large receptive field in upsampling and 3D convolution regularization, this thesis constructs a residual structure that refines the edge depth of the initial sparse depth map using the details of the feature map. The experimental results show that this method produces more refined results for edge details such as model boundaries and textures.

3. To reduce the large memory consumption and computational complexity caused by 3D convolution when regularizing the cost volume, this thesis replaces 3D convolution in the last stage of the cascade structure with an adaptive spatial cost aggregation method. The depth of each center pixel is refined using its neighboring pixels. To prevent aggregation across object boundaries, deformable convolution is introduced so that pixels with large changes in edge depth adaptively select neighboring pixels with higher feature correlation. The experimental results show that this method effectively reduces memory consumption and enables more efficient 3D reconstruction.
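The adaptive cost aggregation described in item 1 can be illustrated with a minimal per-pixel sketch. The abstract does not state how the similarity weights are computed, so the softmax weighting used here is an assumption, and the names `view_weights`, `aggregate_costs`, and `best_depth_index` are hypothetical and not taken from the thesis.

```python
import math

def view_weights(similarities):
    """Softmax over per-view similarity scores (assumed weighting); sums to 1."""
    m = max(similarities)                      # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in similarities]
    total = sum(exps)
    return [e / total for e in exps]

def aggregate_costs(per_view_costs, similarities):
    """Weighted sum of per-view matching costs for a single pixel.

    per_view_costs: one list per source view, holding a cost per depth hypothesis
    similarities:   one similarity score per source view (higher = more reliable)
    """
    weights = view_weights(similarities)
    n_depths = len(per_view_costs[0])
    return [
        sum(w * costs[d] for w, costs in zip(weights, per_view_costs))
        for d in range(n_depths)
    ]

def best_depth_index(aggregated):
    """Winner-take-all: index of the depth hypothesis with the lowest cost."""
    return min(range(len(aggregated)), key=aggregated.__getitem__)

costs = [
    [0.9, 0.2, 0.8],   # view 1: clear minimum at hypothesis 1
    [0.1, 0.9, 0.6],   # view 2 (for example, occluded): misleading minimum at hypothesis 0
]
weighted = aggregate_costs(costs, similarities=[2.0, 0.0])
uniform = aggregate_costs(costs, similarities=[0.0, 0.0])
print(best_depth_index(weighted), best_depth_index(uniform))
```

With equal weights the unreliable view pulls the minimum to hypothesis 0, while the similarity-based weights let the more reliable view dominate and select hypothesis 1. A full network would apply the same idea to every pixel of the cost volume with learned weights.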
Keywords/Search Tags: Deep learning, 3D reconstruction, Multi-view stereo, Depth map, Attention mechanism