Research On Deep Learning-Based Multi-View 3D Reconstruction Algorithms

Posted on:2024-09-17

Degree:Master

Type:Thesis

Country:China

Candidate:M Z Shen

GTID:2568307121473304

Subject:Engineering

Abstract/Summary:

Three-dimensional reconstruction is an active research direction in computer vision and plays an important role in applications such as robotics, autonomous driving, SLAM, virtual reality, and artificial intelligence. It can be divided into voxel-based, mesh-based, and point cloud-based reconstruction; this study focuses on point cloud-based reconstruction. Current three-dimensional point cloud reconstruction methods fall into two categories: traditional methods based on geometric and photometric consistency, and deep learning-based methods. Traditional algorithms are mature and offer a relatively simple workflow with controllable costs. However, they require manual design of complex feature matrices and mainly target ideal Lambertian surfaces, so they often struggle to achieve satisfactory generality and accuracy on complex real-world objects. Deep learning-based algorithms, by contrast, use neural networks to extract image features, then use camera parameters and homography transformations to compute feature volumes relative to the world coordinate system. They construct cost volumes from the feature volumes of the source and reference images, and learn and refine these cost volumes to optimize the generated depth maps and obtain accurate point cloud models. Compared with traditional methods, deep learning-based approaches show significant improvements in generality, reconstruction accuracy, and the handling of complex scenes. However, existing learning-based algorithms still face challenges such as missing feature map information, missing cost volume information, and noise interference.

To address these challenges, this study proposes a cascaded network called Att MCVA-MVSNet, which consists of three modules: a feature selection and processing module, a multi-cost volume aggregation module, and a depth consistency regularization module. Att MCVA-MVSNet improves upon existing networks in each of these modules: (1) To tackle missing information in the input feature maps, the feature selection and processing module uses attention mechanisms to capture semantic information and contextual connections within the feature maps. This improves the quality of the input feature maps and therefore the network's feature representation capability. (2) To address the information loss caused by variance-based cost volume construction, the multi-cost volume aggregation module uses a grouped vector dot product to compute the similarity between feature maps from different viewpoints and constructs multiple cost volumes. Neural networks then learn a weight for each cost volume, and the cost volumes are aggregated by weighted summation. Multiple cost volumes preserve more point cloud information, which improves reconstruction accuracy. (3) To enhance information exchange between the stages of the cascaded network, the cost volume of the previous stage is used as guidance. A difference matrix is constructed between the previous stage's cost volume and the current stage's cost volume. Attention mechanisms learn the effective information within this matrix and use it to guide the construction of the current stage's cost volume, which regularizes the cost volumes.

Experiments were conducted on the DTU dataset and the Tanks and Temples dataset. Noise reduction optimization was applied to the DTU dataset before the network was trained on it. On the DTU dataset, Att MCVA-MVSNet achieved a precision of 0.356, a completeness of 0.330, and an overall score of 0.343, outperforming the other methods in the evaluation. On the Tanks and Temples dataset, it ranked first among all selected methods in average F-score across scenes. These results show that Att MCVA-MVSNet offers better reconstruction performance and generalization ability than the compared methods.
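
The grouped dot product and weighted summation described in module (2) of the abstract can be illustrated with a minimal NumPy sketch. This is not the thesis implementation: the function names `groupwise_correlation` and `aggregate_cost_volumes`, the tensor layout `(C, D, H, W)`, and the use of one scalar softmax weight per cost volume (standing in for the weights that the thesis learns with a neural network) are all assumptions made for illustration. The source features are assumed to be already warped to the reference view at each depth hypothesis.

```python
import numpy as np

def groupwise_correlation(ref_feat, src_feat, num_groups):
    """Grouped dot-product similarity between two feature volumes.

    ref_feat, src_feat: arrays of shape (C, D, H, W). The source features are
    assumed to be already warped into the reference view for D depth hypotheses.
    Returns a cost volume of shape (num_groups, D, H, W).
    """
    channels = ref_feat.shape[0]
    assert channels % num_groups == 0, "channels must divide evenly into groups"
    grouped = (num_groups, channels // num_groups) + ref_feat.shape[1:]
    # Element-wise product, then mean over the channels inside each group.
    return (ref_feat.reshape(grouped) * src_feat.reshape(grouped)).mean(axis=1)

def aggregate_cost_volumes(volumes, logits):
    """Weighted summation of several cost volumes.

    volumes: array of shape (V, G, D, H, W), one cost volume per source view.
    logits:  array of shape (V,), hypothetical stand-ins for learned scores.
    """
    weights = np.exp(logits - logits.max())
    weights = weights / weights.sum()          # softmax, weights sum to 1
    return np.tensordot(weights, volumes, axes=1)

# Toy usage with random features (shapes chosen only for illustration).
rng = np.random.default_rng(0)
C, D, H, W, G = 8, 4, 3, 3, 2
ref = rng.standard_normal((C, 1, H, W)).repeat(D, axis=1)  # reference copied along depth
srcs = [rng.standard_normal((C, D, H, W)) for _ in range(2)]
volumes = np.stack([groupwise_correlation(ref, s, G) for s in srcs])
cost = aggregate_cost_volumes(volumes, logits=np.array([0.2, -0.1]))
print(cost.shape)  # (2, 4, 3, 3)
```

Unlike a variance-based cost volume, which collapses all views into a single statistic per channel, this formulation keeps one similarity volume per source view until the weighted sum, which is the property the abstract credits with preserving more information.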

Keywords/Search Tags:

three-dimensional reconstruction, deep learning, attention mechanism, cost volume, depth consistency

Related items

1	Multi-View 3D Reconstruction Based On Deep Learning
2	Deep Learning Based Research On Depth Map Super-Resolution Reconstruction
3	Research On 3D Reconstruction Method In Images Based On Deep Learning
4	Research On Image Depth Estimation Based On Deep Learning
5	Research On 3D Reconstruction Method Based On Multi-View Depth Estimation
6	Research On Monocular Depth Estimation Based On Deep Learning
7	Research On 3D Point Cloud Reconstruction Algorithm Based On Depth Image Prediction
8	Research On RGBD Image Enhancement Based On Deep Learning
9	Research On Deep Image Reconstruction Method Based On Semi-Supervised Learning
10	A Study On Depth Super-Resolution Based On Deep Learning